A new cryptic species of Hyphessobrycon Durbin, 1908 (Characiformes, Characidae) from the Eastern Amazon, revealed by integrative taxonomy

Hyphessobrycon caru sp. nov. is described based on five different and independent methods of species delimitation, making the hypothesis of this new species supported by an integrative taxonomy perspective. This new species has a restricted distribution, occurring just in the upper Pindaré river drainage, Mearim river basin, Brazil. It is a member of the rosy tetra clade, which is characterized mainly by the presence of a dark brown or black blotch on dorsal fin and absence of a midlateral stripe on the body. Hyphessobrycon caru sp. nov. is distinguished from the members of this clade mainly by the shape of its humeral spot, possessing few irregular inconspicuous vertically arranged chromatophores in the humeral region, or sometimes a very thin and inconspicuous humeral spot, and other characters related to teeth count, and color pattern. The phylogenetic position of the new species within the rosy tetra clade was based on molecular phylogenetic analysis using sequences of the mitochondrial gene cytochrome oxidase subunit 1. In addition, a new clade (here termed Hyphessobrycon micropterus clade) within the rosy tetra clade is proposed based on molecular data, comprising H. caru sp. nov., H. micropterus, H. piorskii, and H. simulatus, and with H. caru sp. nov. and H. piorskii recovered as sister species. Our results suggest cryptic speciation in the rosy tetra clade and, more specifically, in the H. micropterus clade. We recommend the use of integrative taxonomy for future taxonomic revisions and species descriptions when dealing with species complexes and groups containing possible cryptic species.


Introduction
Hyphessobrycon Durbin, 1908 is a species-rich characid genus comprising about 160 valid species. It is widely distributed along the river basins of the Neotropical region, from southern Mexico to the La Plata River basin in northeastern Argentina (Carvalho and Malabarba 2015; García-Alzate et al. 2017; Guimarães et al. 2018). The genus was first proposed as a subgenus of Hemigrammus Gill, 1858 by Durbin in Eigenmann (1908), differing from the latter only by the absence of scales covering the caudal-fin. Hyphessobrycon was reviewed by Eigenmann (1918, 1921) in works that still constitute the most comprehensive revisionary studies of the genus. The large number of species included within Hyphessobrycon and the poor knowledge of the alpha and beta-taxonomy of species and species groups are among the major challenges for a more comprehensive taxonomic study and phylogenetic analyses of the genus. It is widely known that Hyphessobrycon does not constitute a monophyletic group (Weitzman and Palmer 1997a; Mirande 2010, 2018; Oliveira et al. 2011; Carvalho and Malabarba 2015; Carvalho et al. 2017; Moreira and Lima 2017; Betancur-R. et al. 2018; Guimarães et al. 2018). Nevertheless, groups of species have been proposed based primarily on similarities of color pattern and other external features (e.g. Weitzman and Palmer 1997a; García-Alzate et al. 2008; Moreira and Lima 2017). Some of them are probably merely artificial operational assemblages to aid species identification, whereas others represent potential monophyletic groups, delimited by exclusive character states (e.g. Castro-Paz et al. 2014; Carvalho and Malabarba 2015; Guimarães et al. 2018).
In this context of integrative taxonomy, the present study aims to investigate the diversity within the rosy tetra clade sensu Weitzman and Palmer (1997a). This clade comprises around 30 species, including some species of Hyphessobrycon and other allied species, that are appreciated as aquarium fishes due to their attractive color patterns (e.g. Weitzman and Palmer 1997a, 1997b, 1997c, 1997d; Zarske 2008; Hein 2009; Guimarães et al. 2018).
This group has had its composition and name changed over the last decades, and a detailed taxonomic history is presented by Weitzman and Palmer (1997a). Two previous papers (Castro-Paz et al. 2014; Guimarães et al. 2018) applied molecular approaches to investigate the diversity of the rosy tetra clade, and they suggested that its taxonomic resolution should be better investigated, as it could include cryptic species or valid species that may have been synonymized. Here, a new species of Hyphessobrycon belonging to the rosy tetra clade is described from the upper Pindaré river drainage, Mearim river basin, a coastal river basin of the Eastern Amazon region, Brazil, based on both morphological and molecular data. Furthermore, a new clade within the rosy tetra clade is proposed based on the phylogenetic tree topology presented.

Materials and methods
Taxa sampling, specimen collection, and preservation
Individuals collected for this study were euthanized with a buffered solution of MS-222 at a concentration of 250 mg L⁻¹ for a period of 10 min or more until opercular movements completely ceased. Specimens selected for morphological analysis were fixed in formalin and left for 10 days, after which they were preserved in 70% ethanol. Molecular data were obtained from specimens that were euthanized, fixed, and preserved in absolute ethanol.
Specimens for morphological analysis are listed in type and comparative material lists. Specimens for molecular approaches are listed in Table 1. We also retrieved sequences from other species of Hyphessobrycon and allied species for a comparative analysis from the Barcode of Life Database (BOLD) and the National Center for Biotechnology Information (NCBI) databases (Table 1).

Morphological analysis
Measurements and counts were made according to Fink and Weitzman (1974), with exception of the scale rows below lateral line, which were counted to the insertion of pelvic-fin. Vertical scale rows between the dorsal-fin origin and lateral line do not include the scale of the median predorsal series situated just anterior to the first dorsal-fin ray. Counts of supraneurals, vertebrae, procurrent caudal-fin rays, unbranched dorsal and anal-fin rays, branchiostegal rays, gill-rakers, premaxillary, maxillary, and dentary teeth were taken only from cleared and stained paratypes (C&S), prepared according to Taylor and Van Dyke (1985). The four modified vertebrae that constitute the Weberian apparatus were not included in the vertebrae counts and the fused PU1 + U1 was considered as a single element. Osteological nomenclature follows Weitzman (1962).

DNA extraction, amplification, and sequencing
DNA was extracted from fin clips using the Wizard Genomic DNA Purification kit (Promega) according to the manufacturer's protocol. Fragments of the cytochrome c oxidase subunit 1 gene (hereafter COI) from mitochondrial DNA were amplified using the universal primers designed by Ward et al. (2005) for fish. Polymerase chain reactions (PCR) comprised a total volume of 15 µl containing 1× Polymerase buffer, 1.5 mM MgCl2, 200 µM dNTP, 0.2 µM of each primer, 1 U of Taq Polymerase (Invitrogen), 100 ng of DNA template, and ultrapure water. The PCR cycles were as follows: 2 min at 94 °C, followed by 35 cycles of 94 °C for 30 s, 54 °C for 30 s, and 72 °C for 1 min, and 10 min at 72 °C. Amplicons were purified using the Illustra GFX PCR DNA and Gel Purification Kit (GE Healthcare Systems) and sequenced using the forward primer by an outsourced sequencing service at the University of São Paulo, using the BigDye Terminator 3.1 Cycle Sequencing kit in an ABI 3730 DNA Analyser (Applied Biosystems).

Data partition, evolution models, and alignment
The dataset included a single gene: COI (680 base pairs, bp). Sequences were aligned using ClustalW (Chenna et al. 2003). The DNA sequences were translated into amino acid residues in MEGA 7 (Kumar et al. 2016) to confirm the absence of premature stop codons and indels. In the alignment, gaps were coded with a dash (−) and missing data with a question mark (?), but during analyses both were treated as missing data. Substitution saturation tests were performed in DAMBE5 (Xia 2013) according to the algorithm proposed by Xia et al. (2003). The best-fit evolutionary model (GTR+G) was selected using the corrected Akaike Information Criterion (AICc) in jModelTest 2.1.7 (Darriba et al. 2012).
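The stop-codon check described above can be illustrated with a minimal Python sketch. This is not the MEGA 7 procedure itself; it only assumes that the vertebrate mitochondrial genetic code applies (NCBI translation table 2, in which TAA, TAG, AGA, and AGG are stops and TGA codes for tryptophan) and that the reading frame of the fragment is known. The function name premature_stops and the example sequences are hypothetical.

```python
# Stop codons in the vertebrate mitochondrial genetic code (NCBI translation table 2).
# Note that TGA is NOT a stop here: it codes for tryptophan.
VERT_MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def premature_stops(seq, frame=0):
    """Return 0-based codon indices of in-frame stop codons.

    Assumes `seq` is an ungapped internal fragment of a protein-coding gene,
    so that any in-frame stop codon is a premature one.
    """
    seq = seq.upper()
    # Split into complete codons starting at the given frame offset.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [i for i, codon in enumerate(codons) if codon in VERT_MITO_STOPS]


# Hypothetical toy sequences, for illustration only.
clean = premature_stops("ATGGCATGA")    # TGA is tryptophan in this code
flagged = premature_stops("ATGAGATTT")  # AGA is a stop in this code
```

An empty list is what would be expected for a genuine COI fragment; a non-empty list may indicate a sequencing error, a wrong reading frame, or a nuclear pseudogene copy.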

Species concept, species delimitation, and diagnoses
The unified species concept is herein adopted by expressing the conceptual definition shared by all traditional species concepts, "species are (segments of) separately evolving metapopulation lineages", disentangling operational criterion elements to delimit taxa from species concepts (de Queiroz 2005, 2007). According to this concept, species are treated as hypothetical units and could be tested by the application of distinct criteria (species delimitation methods) (de Queiroz 2005, 2007). It allows for any criteria to separately provide evidence about species limits and identities, independently from other criteria (de Queiroz 2005, 2007). However, evidence corroborated from multiple operational criteria is considered to produce stronger support for hypotheses of lineage separation (de Queiroz 2007; Goldstein and Desalle 2010), a practice called "integrative taxonomy" (Dayrat 2005; Goldstein and Desalle 2010; Padial et al. 2010). Five distinct and independent operational criteria for species delimitation, based on morphological and molecular data, were implemented here: Population Aggregation Analysis (Davis and Nixon 1992) (hereafter PAA); DNA barcoding, as proposed by Hebert et al. (2003a, 2003b, 2004a, 2004b) (hereafter DBC); a tree-based method as proposed by Wiens and Penkrot (2002) (hereafter WP, following Sites and Marshall 2003); a character-based DNA barcoding as proposed by Desalle et al. (2005) (hereafter CBB); and a coalescent species delimitation method termed the Bayesian implementation of the Poisson tree processes (hereafter bPTP, following Zhang et al. 2013). All species delimitation methods here adopted, except PAA, were performed on cytochrome c oxidase subunit 1 (COI) sequences, as it is a mitochondrial gene with fast evolutionary rate, suitable for single locus species delimitation approaches (Avise 2000).

Traditional DNA barcoding (DBC) and Phylogenetic analysis
We used the Kimura 2-parameter model (K2P) (Kimura 1980) to estimate the pairwise genetic distances between species in MEGA 7 (Kumar et al. 2016). We used DnaSP v. 6 (Rozas et al. 2003) to estimate the number of variable sites and haplotypes. A Bayesian inference (BI) phylogenetic tree was estimated with the MrBayes (Huelsenbeck and Ronquist 2001) plugin in Geneious 9.0.5 to reconstruct the evolutionary relationships among terminals, using the General Time Reversible model (GTR+G) as the evolutionary model. Bayesian tree inference was based on a chain length of 10 million generations, with a burn-in of 500,000 generations and trees subsampled every 10,000 generations. We used a sequence of Hyphessobrycon flammeus Myers, 1924 as outgroup.
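For readers unfamiliar with the K2P distance, the following Python sketch shows how it is obtained for a single pair of aligned sequences. It is an illustration only, not the MEGA 7 implementation; the function name k2p_distance is hypothetical, and sites with gaps or missing data are simply skipped (pairwise deletion), which is an assumption about how such sites are handled. With P the proportion of transitions and Q the proportion of transversions, the distance is d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)] (Kimura 1980).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps, missing data and ambiguity codes (pairwise deletion).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A distance of 0.036 from such a function corresponds to the 3.6% divergence format used when reporting pairwise COI distances.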

Wiens and Penkrot analysis (WP)
WP is based on the direct inspection of haplotype trees generated from the phylogenetic analysis having as terminals at least two individuals (haplotypes) of each focal species. In this method, the term "exclusive" is used instead of monophyletic, as the term monophyly is considered inapplicable below the species level (Wiens and Penkrot 2002). Clustered haplotypes with concordant geographic distribution forming mutual and well supported clades (exclusive lineages) are considered strong evidence for species discrimination (absence of gene flow with other lineages). When haplotypes from the same locality fail to cluster together, there is potential evidence for gene flow with other populations (Wiens and Penkrot 2002). Statistical support for clades is assessed by the posterior probability, considered as significant at values about 0.95 or higher (Alfaro and Holder 2006). When only one haplotype (specimen) from one putative population was available, the species delimitation was based on the exclusivity of the sister clade of this single haplotype, supported by significant values, allowing us to perform the test in populations with only one haplotype (Wiens and Penkrot 2002). In addition, the method allows recognition of nonexclusive lineages as species since their sister clades are exclusive and supported by significant values (Wiens and Penkrot 2002).
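The exclusivity criterion can be sketched in Python as follows. This is a simplified illustration, not the procedure of Wiens and Penkrot (2002): it assumes a rooted tree written as nested tuples, and it ignores both posterior probabilities and geographic concordance, which must be evaluated separately. The haplotype labels and the tree below are hypothetical and do not reproduce the topology recovered in this study.

```python
def leaf_set(node):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the leaf set of every clade in a rooted tree of nested tuples."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_exclusive(tree, leaf_to_species, species):
    """True if all haplotypes of `species`, and only those, form one clade."""
    target = frozenset(
        leaf for leaf, sp in leaf_to_species.items() if sp == species
    )
    return any(clade == target for clade in clades(tree))


# Hypothetical haplotype tree and labels, for illustration only.
toy_tree = ((("sp1_h1", "sp1_h2"), ("sp2_h1", "sp2_h2")),
            ("sp3_h1", ("sp4_h1", "sp3_h2")))
toy_labels = {
    "sp1_h1": "sp1", "sp1_h2": "sp1",
    "sp2_h1": "sp2", "sp2_h2": "sp2",
    "sp3_h1": "sp3", "sp3_h2": "sp3",
    "sp4_h1": "sp4",
}
```

In this toy tree, sp1 and sp2 are exclusive, whereas sp3 is not, because one of its haplotypes groups with the single haplotype of sp4.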

Character-based DNA barcoding (CBB)
The CBB is similar to the population aggregation analysis proposed by Davis and Nixon (1992), but directed to nucleotides as an alternative method for diagnosing taxa through DNA barcodes, as the original method is based on subjective cut-off distance measures for species designation (Hebert et al. 2003a, 2003b, 2004a, 2004b). This method delimits species based on a unique combination of nucleotides within a site shared by individuals of the same population or group of populations. In addition, species were diagnosed by nucleotide substitutions following Costa et al. (2014). Optimization of nucleotide substitutions among lineages of the Hyphessobrycon micropterus clade was obtained from the Maximum Parsimony topology, using TNT 1.5 (Goloboff and Catalano 2016). The Maximum Parsimony analysis (MP) was run with the following parameters: traditional search, tree bisection reconnection branch swapping (TBR), 1 random seed, setting random taxon-addition replicates to 1,000, multi-trees in effect, collapsing branches of zero length, characters equally weighted, and 10,000 trees saved per replication. MP tree branch support was given by bootstrap analysis (Felsenstein 1985), using a heuristic search with 1,000 replicates and the same settings used in the MP search, saving a maximum of 1,000 trees in each random taxon-addition replicate. The analysis was rooted on Hyphessobrycon flammeus Myers, 1924. Each nucleotide substitution is represented by its relative numeric position determined through sequence alignment with the complete mitochondrial genome of Astyanax paranae Eigenmann, 1914 (KX609386.1:5503-7062, mitochondrion complete genome), followed by the specific nucleotide substitution in parentheses. The results of this analysis are presented in Suppl. material 1: Box 1 and in the molecular diagnosis section.
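The idea of a diagnostic nucleotide can be illustrated with a short Python sketch. It covers only the simplest case, a site where every individual of the focal group shares one nucleotide that occurs in no individual of any other group; it does not perform the parsimony optimization in TNT, and it reports positions relative to the input alignment rather than to the Astyanax paranae mitochondrial genome. The function name diagnostic_sites and the toy alignment are hypothetical.

```python
def diagnostic_sites(alignment, focal):
    """Find sites fixed in the focal group and absent from all other groups.

    alignment: dict mapping a group name to a list of aligned sequences.
    Returns a list of (1-based alignment position, nucleotide) tuples.
    """
    focal_seqs = alignment[focal]
    other_seqs = [
        seq for group, seqs in alignment.items() if group != focal for seq in seqs
    ]
    sites = []
    for i in range(len(focal_seqs[0])):
        focal_states = {seq[i] for seq in focal_seqs}
        if len(focal_states) != 1:
            continue  # polymorphic within the focal group
        state = next(iter(focal_states))
        if state not in "ACGT":
            continue  # gap or missing data in the focal group
        # Gaps or missing data in other groups count as "not this nucleotide",
        # a simplification that may overstate diagnosability.
        if all(seq[i] != state for seq in other_seqs):
            sites.append((i + 1, state))
    return sites


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "A": ["ACGT", "ACGT"],
    "B": ["ACTT", "GCTT"],
    "C": ["ACTA", "ACTA"],
}
```

For the toy alignment, group A is diagnosed by G at position 3 and group C by A at position 4, while group B has no single-site diagnostic nucleotide and would need a combination of sites.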

Bayesian implementation of the Poisson tree processes (bPTP)
The bPTP is a coalescent phylogeny-based species delimitation method aimed at delimiting species based on single locus molecular data (Zhang et al. 2013). An advantage of bPTP is that it does not need an ultrametric calibration like other coalescent approaches, avoiding errors and computer intensive processes (Zhang et al. 2013). The method relies on the number of substitutions between haplotypes and assumes that more molecular variability is expected between species than within a species (Zhang et al. 2013). In our analysis the dataset was reduced to include only unique haplotypes from the species of the H. micropterus clade. Outgroups were restricted to Hyphessobrycon bentosi Durbin, 1908 and Hyphessobrycon copelandi Durbin, 1908. Sequences were aligned using ClustalW (Chenna et al. 2003). The best-fit evolutionary model (GTR+G) for the reduced dataset was selected using the corrected Akaike Information Criterion (AICc) in jModelTest 2.1.7 (Darriba et al. 2012). The input phylogenetic tree was inferred in MrBayes 3.2.6 (Ronquist et al. 2012), with the following parameters: two independent Markov chain Monte Carlo (MCMC) runs of four chains each for 3 million generations, with a sampling frequency of 1,000. The bPTP analysis was performed on the Exelixis Lab's web server http://species.h-its.org/ptp/, following the default parameters except for a 20% burn-in.

Diagnosis (PAA). The new species Hyphessobrycon caru sp. nov. differs from most of its congeners, except members of the rosy tetra clade, by the presence of a dark brown or black blotch on dorsal-fin (vs absence) and absence of a midlateral stripe on the body (vs presence).
Furthermore, the new species differs from H. bentosi Durbin, 1908, H. erythrostigma, H. pyrrhonotus, H. rosaceus, and H. socolofi by presenting only one tooth in the outer row of premaxillary, and this unique tooth just slightly displaced from inner row (vs two or more teeth, displaced from the inner row); from H. hasemani and H. micropterus by the dorsal-fin spot located approximately at the middle of the fin's depth, not reaching its tip (vs spot located approximately at the middle of the fin's depth, reaching its tip in adults); from H. hasemani by presenting tri to unicuspid teeth in the inner row of premaxillary and dentary (vs tricuspid or pentacuspid teeth); from H. piorskii by having the anal-fin profile usually nearly straight (vs anal-fin profile usually falcate). In addition, H. caru sp. nov. is easily distinguished from Pristella maxillaris (Ulrey, 1894), Moenkhausia hemigrammoides Géry, 1965, and Hemigrammus unilineatus (Gill, 1858) by the absence of a black oblique stripe or band on the anterior portion of the anal-fin (vs presence) (Fig. 1).
Description. Morphometric data of holotype and paratypes are presented in Table 2. Body small (with maximum SL of 25.4 mm), compressed, moderately deep, greatest body depth slightly anterior to dorsal-fin base. Lateral body profile straight and downward directed from the end of dorsal-fin to adipose-fin, straight or slightly convex between latter point and origin of dorsalmost procurrent caudal-fin ray. Dorsal profile of head convex from upper lip to vertical through eye; predorsal profile of body roughly straight, dorsal-fin base slightly convex, posteroventrally inclined; ventral profile of head convex from lower jaw to pelvic-fin origin. Ventral profile of body straight or slightly convex from pelvic-fin origin to anal-fin origin; straight and posterodorsally slanted along anal-fin base; and slightly concave on caudal peduncle. Jaws equal, mouth terminal, anteroventral end of dentary protruding. Maxilla reaching vertical to anterior margin of pupil. Premaxillary teeth in two rows. Outer row with one unicuspid or tricuspid tooth, just slightly displaced from inner row; inner row with 6(5), 7(6), or 8(1) tricuspid teeth and one unicuspid tooth. Maxilla with 3(2) tricuspid teeth and two unicuspid teeth, 4(3) tricuspid teeth and two unicuspid teeth, or 5(7) tricuspid teeth. Dentary with 5(10) or 6(1) larger tricuspid teeth followed by one smaller tricuspid tooth and 5(2), 6(2), 7(3), or 8(5) smaller unicuspid teeth (Fig. 3).

Colour in alcohol.
Ground coloration light yellowish brown. Humeral region with few irregular inconspicuous vertically arranged chromatophores, sometimes very thin and inconspicuous humeral spot. Flank with chromatophores homogeneously scattered, more concentrated on posterior region to humeral spot, posterior region of dorsal-fin base origin and below mid-portion of trunk, between anal-fin origin and caudal peduncle. Ventral region lacking dark-brown chromatophores. Dark-brown chromatophores present on head and more concentrated on dorsal portion, becoming sparser on cheek and preopercular regions.
Dorsal-fin ground coloration hyaline, with conspicuous black or dark-brown spot located on anterior portion of fin, reaching about 6th ray, approximately between one-half and two-thirds of fin depth. Anal and caudal-fins hyaline. Caudal-fin darker, usually dark brown, on its posterior margin and on its base. Adipose-fin hyaline to light brown, with dark-brown or black chromatophores more concentrated on its dorsal portion, depending on the specimen preservation state. Pectoral and pelvic-fins hyaline; pelvic-fin with variable amounts of dark-brown pigmentation remaining depending on the specimen preservation state.

Sexual dimorphism.
Mature males with small bone hooks on anal and pelvic-fin rays. Bone hooks absent on females. Anal-fin presenting bone hooks from 3rd, 4th, or 5th rays to the last ray. Number of hooks variable, increasing from the first to the last rays. Pelvic-fin presenting 2nd, 3rd, 4th, or 5th rays with 5, 6, or 7 smaller hooks.
Etymology. The specific epithet refers to "Caru", the name of an area (about 70,000 ha) inhabited by Brazilian native peoples of the Guajá and Guajajara ethnicities. People from this area speak the Tupi language, have suffered the consequences of European colonization, and are under threat due to the pressure for exploitation of the protected territory.
Geographic distribution. Hyphessobrycon caru sp. nov. has a restricted geographic distribution, being known only from the upper Pindaré river drainage, Mearim river basin, in the state of Maranhão, northeastern Brazil (Fig. 4). This species was never collected in the lower portions of this river drainage during 8 years of field trips conducted by EG and PB, including about 15 expeditions.
(Table 3). Hyphessobrycon caru sp. nov. is divergent on average 17.0% from the other taxa, with a minimum distance of 3.6% to H. piorskii and a maximum of 21.8% to Pristella maxillaris (Table 3).
(Fig. 6). This outcome was similar to the aforementioned results. The species included as outgroups (H. bentosi and H. copelandi) were also supported as independent lineages.

Discussion
Molecular techniques are now frequently used to resolve species complexes and discover cryptic species (e.g. Bickford et al. 2006; Costa and Amorim 2011; Pereira et al. 2011; Adams et al. 2014; Costa-Silva 2015; Costa et al. 2012, 2014, 2017; Amorim 2018; Guimarães et al. 2018; Ottoni et al. 2019) and can be an excellent complement to traditional taxonomy (Kekkonen and Hebert 2014). DNA barcoding has proven to be very efficient for delimiting species of Hyphessobrycon, mainly in groups with little morphological variation (i.e., cryptic species) (see Castro-Paz et al. 2014; Guimarães et al. 2018), preferably when applied together with other species delimitation methods, such as PAA, CBB, bPTP, and WP, in an integrative taxonomy perspective (Guimarães et al. 2018). The recognition of different genetic patterns and lineages in groups with very similar morphology has been a common pattern in the tree of eukaryotic life. This is observed particularly often in species-rich genera, such as in several Neotropical fishes (e.g. Pereira et al. 2011; Roxo et al. 2012; Castro-Paz et al. 2014; Melo et al. 2014, 2016a; Benzaquem et al. 2015; Benine et al. 2015; Ottoni et al. 2019). DNA techniques can help to uncover morphologically hidden diversity (Bickford et al. 2006; Adams et al. 2014), delimiting a putative population or group of populations as an independent lineage (species); subsequently, through a more meticulous analysis of morphological features, morphological differences between cryptic species can be found.
The large number of described Hyphessobrycon species (about 160 spp.), with new species described every year, reveals an astonishing diversity within the genus. During the past 10 years, about 50 new species have been described. However, historically Hyphessobrycon species have been described only on the basis of morphological features, including differences in pigmentation patterns and in teeth number and morphology, using few individuals per species (e.g. Steindachner 1882; Eigenmann 1915; Zarske 2008, 2014; Bragança et al. 2015). Recently, DNA barcoding in characoid fishes has been used to discriminate species, identify new ones, and reveal that it is not always possible to differentiate species based solely on their morphology (Ornelas-Garcia et al. 2008; Pereira et al. 2011; Castro-Paz et al. 2014; Melo et al. 2014, 2016a; Benine et al. 2015).
Our results suggest cryptic speciation in the rosy tetra clade, more specifically in a new clade here defined, the Hyphessobrycon micropterus clade, including H. caru sp. nov., H. micropterus, H. piorskii, and H. simulatus, so far only known from the Pindaré, Itapecuru, Munim, Preguiças, and São Francisco river drainages of Brazil and the coastal river basins of French Guiana and Suriname (Guimarães et al. 2018; Brito et al. 2019; this study). The clade proposed here is supported by high node support values (maximum posterior probability in BI and 99% bootstrap in MP). In addition, this clade was corroborated by 20 synapomorphic nucleotide substitutions (Fig. 5; Suppl. material 1: Box 1).
Hyphessobrycon caru sp. nov. is herein described within the Hyphessobrycon micropterus clade based on five different and independent methods of species delimitation (PAA, DBC, WP, CBB and bPTP), characterized by different criteria and assumptions. Hyphessobrycon caru sp. nov. is distinguished from all its congeners by a combination of unambiguous morphological character states [see Diagnosis (PAA)]. In our Bayesian phylogenetic analysis (Fig. 5A), haplotypes of H. caru sp. nov. formed a single exclusive clade with maximum posterior probability value (posterior probability = 1) (WP). Furthermore, the COI average genetic distance of H. caru sp. nov. when compared with the other taxa herein analyzed was 19.6% and its minimum COI genetic distance was 3.6% to H. piorskii (DBC). This minimum distance is greater than the threshold inferred for delimitations among Neotropical fish species (2% according to Pereira et al. 2011). Moreover, H. caru sp. nov. was also molecularly diagnosed by six synapomorphic nucleotide substitutions (Fig. 5b; Suppl. material 1: Box 1), as well as by a combination of other nucleotide substitutions (see CBB, molecular diagnosis), and corroborated by a bPTP analysis.
Agreement among these methods makes the hypothesis of this new species stronger from an integrative taxonomy perspective (see Dayrat 2005; de Queiroz 2007; Goldstein and Desalle 2010; Padial et al. 2010). Therefore, we recommend the use of integrative taxonomy for future taxonomic revisions and species descriptions when dealing with species complexes and groups containing possible cryptic species.