Genetic evidence for the recognition of two allopatric species of Asian bronze featherback Notopterus (Teleostei, Osteoglossomorpha, Notopteridae)

The fish genus Notopterus Lacepède, 1800 (Notopteridae) currently includes only one species, the Asian bronze featherback Notopterus notopterus (Pallas, 1769). This common freshwater species is widely distributed in the Oriental region, from the Indus basin in the west to the Mekong basin in the east and Java Island in the south. To examine the phylogeographic structure of N. notopterus across its range, we analysed 74 publicly available cytochrome oxidase I (COI) sequences, 72 of them determined from known-origin specimens, along with four newly determined sequences from Peninsular Malaysian specimens. We found that N. notopterus is a complex of two allopatric species that diverge from each other by a mean p-distance of 7.5%. The first species is endemic to South Asia (from the Indus basin to the Ganga-Brahmaputra system), whereas the distribution of the second species is restricted to Southeast Asia. The exact limit between the distributions of these two species is not known, but it should fall somewhere between the Ganga-Brahmaputra and Salween basins, a region already identified as a major faunal boundary in the Oriental region. The name N. notopterus is retained for the Southeast Asian species, while the name Notopterus synurus (Bloch & Schneider, 1801) should be applied to the South Asian species. A comparative morphological study is needed to reveal the degree of morphological differentiation between the two species.


Introduction
The complex geological history of the Oriental region has produced a high degree of geographical genetic structure within freshwater organisms, and species once considered widely distributed in this region often comprise distinct genetic lineages (e.g. de Bruyn et al. 2013; Dahruddin et al. 2017; Jamaluddin et al. 2019; Rüber et al. 2020). The bronze featherback fish Notopterus notopterus, currently the only valid species of the genus Notopterus, is one such widely distributed species, occurring from the Indus basin (Pakistan and India) in the west to the Mekong region in the east (Cambodia, Laos, Thailand and Vietnam, extending slightly east to the Annamite Range) and to Java (Indonesia) in the south. Intriguingly, N. notopterus has not yet been recorded in Borneo, where it is most likely absent (Roberts 1989; Christensen 1992; Roberts 1992; Kottelat 1995; Kottelat and Widjanarti 2005; Parenti and Lim 2005).
Specimens of N. notopterus are distinguishable from all other Oriental freshwater fishes by (amongst other features) their distinct tapered tail and the corners of their mouth lying below the eye (not behind it, as in the other Oriental notopterid genus Chitala Fowler, 1934) (Fig. 1A). This species is reported to reach up to 60 cm in standard length (Roberts 1992). A wealth of data is available for this species, documenting its reproductive behaviour and embryonic development (Yanwirsal et al. 2017), cytogenetics (Barby et al. 2019) and phylogeography (Gupta et al. 2013; Takagi et al. 2010). However, none of these studies simultaneously examined specimens sampled across the whole distributional range of N. notopterus. Inoue et al. (2009) reconstructed the molecular phylogeny of the family Notopteridae to discuss its evolution and biogeography. These authors sequenced the complete mitogenomes of two specimens of N. notopterus, one from India and the other from Thailand. The comparison of these two mitogenomic sequences revealed that the two specimens diverged from each other about 25 million years ago. This result was unexpected because no consistent morphological variation had previously been reported within this species (see Roberts 1992).
Here, we investigated the genetic diversity within the genus Notopterus across its range. We analysed a dataset of 78 sequences of the standard barcoding fragment (655 base pairs) of the cytochrome oxidase I (COI) gene: 72 publicly available sequences determined from specimens of Notopterus with precise information on their collection locality, two sequences extracted from the complete mitogenomes of two specimens without precise localities (Inoue et al. 2009) and four sequences that we newly determined from specimens collected in Peninsular Malaysia.

COI sequences mining and selection
Using the NCBI GenBank nucleotide database (https://www.ncbi.nlm.nih.gov/), we searched for available cytochrome oxidase subunit I sequences of N. notopterus (search made on 02/12/2019) using the terms "Notopterus" and "oxidase." This search retrieved 88 mitochondrial entries (excluding three entries related to whole mitogenomes), from which we selected only those with either latitude-longitude coordinates or geographic localities sufficiently precise that we could confidently estimate their latitude-longitude coordinates. After this initial screening, the dataset included partial COI sequences of 77 specimens of Notopterus. We then checked the length and characteristics of these sequences using the software Mesquite version 3.31 (Maddison and Maddison 2017). Their quality was assessed by searching for stop codons, indels and/or a relatively high number of autapomorphic changes, any of which may indicate either pseudogenes or sequencing/editing errors. Five of the 77 sequences fell below these standards and were excluded (all 16 excluded sequences and the respective reasons for excluding them are listed in Suppl. material 1: Table S1). After this screening, the dataset comprised 72 COI sequences of specimens collected in India (40 specimens), Bangladesh (two), Peninsular Malaysia (two), Indonesia (six), Myanmar (16) and Thailand (six) (localities are mapped in Fig. 1A and listed in Suppl. material 1: Table S1). We added to this dataset two COI sequences of specimens (without precise locality) examined in Inoue et al. (2009), along with the four sequences we newly determined from specimens collected in Peninsular Malaysia, two from the Kerian River (Penang State) and two from Bera Lake (Pahang State). We deposited these voucher specimens in the ichthyological collection of the School of Biological Sciences, Universiti Sains Malaysia, under accession numbers USMFC (3) 00002-3.
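The stop-codon part of the quality screening described above can be sketched as follows. This is a minimal illustration, not the procedure run in Mesquite: the function names are hypothetical, the toy sequence is not a real COI fragment, and the sketch assumes unaligned, ungapped sequences read on the forward strand with the vertebrate mitochondrial genetic code (NCBI translation table 2), in which TAA, TAG, AGA and AGG are stop codons.

```python
# Stop codons of the vertebrate mitochondrial genetic code (NCBI translation table 2).
VERT_MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def stop_positions(seq, frame):
    """Return 0-based start positions of stop codons read in the given frame (0, 1 or 2)."""
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in VERT_MITO_STOPS]


def stop_free_frames(seq):
    """Return the forward reading frames (0, 1, 2) that contain no stop codon."""
    return [frame for frame in range(3) if not stop_positions(seq, frame)]


def looks_suspect(seq):
    """Flag a sequence in which every forward reading frame contains a stop codon.

    Assumption of this sketch: a genuine protein-coding fragment should have
    at least one open reading frame over its whole length.
    """
    return len(stop_free_frames(seq)) == 0
```

For the hypothetical 12-bp fragment "ATGGCATAAGGC", only frame 1 is free of stop codons, so the sequence would not be flagged. A flagged sequence is only a candidate pseudogene or editing error and would still need to be inspected by eye.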

DNA extraction, amplification and sequencing
To determine the COI sequences of these specimens, we amplified the gene fragment by PCR using the following primer pair: forward FishF1 (5´-TCAACCAACCACAAAGACATTGGCAC-3´) and reverse FishR1 (5´-TAGACTTCTGGGTGGCCAAAGAATCA-3´) (Ward et al. 2005). Reactions were carried out in a 25 µl reaction volume containing 15.75 µl of sterile distilled H2O, 5.5 µl of 5× MyTaq Red reaction buffer (Bioline), 0.5 µl of each primer (10 µM), 0.25 µl of iTaq DNA polymerase (INtRON Biotechnology) and 2.5 µl of template containing approximately 5 ng DNA. The thermal cycle profile consisted of an initial denaturation step at 94 °C for 4 min, followed by 35 cycles of denaturation at 94 °C for 30 sec, annealing at 47.9 °C for 50 sec and extension at 72 °C for 1 min, and a final extension at 72 °C for 7 min. PCR products were sent to First Base Sdn. Bhd. for sequencing by the standard Sanger methodology. Chromatograms were edited with Molecular Evolutionary Genetics Analysis X (MEGA X) (Stecher et al. 2020). Sequences are deposited in GenBank under accession numbers MT328860-3.
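As a quick consistency check on the protocol above, the listed component volumes can be summed and the hold times of the thermal profile added up. This is only an arithmetic sketch: the dictionary keys and function names are illustrative, and the time estimate counts programmed hold times only, ignoring the ramping between temperatures, which depends on the thermocycler.

```python
# Component volumes in microlitres, as listed in the text.
REACTION_MIX_UL = {
    "sterile distilled H2O": 15.75,
    "5x reaction buffer": 5.5,
    "forward primer FishF1 (10 uM)": 0.5,
    "reverse primer FishR1 (10 uM)": 0.5,
    "DNA polymerase": 0.25,
    "template DNA": 2.5,
}


def total_volume_ul(mix):
    """Sum the component volumes of one reaction."""
    return sum(mix.values())


def program_hold_seconds(cycles=35):
    """Add up the hold times of the thermal profile (ramp times not included)."""
    initial_denaturation = 4 * 60
    per_cycle = 30 + 50 + 60  # denaturation + annealing + extension
    final_extension = 7 * 60
    return initial_denaturation + cycles * per_cycle + final_extension
```

Under these assumptions the components add up to the stated 25 µl, and the programme holds for 5560 seconds in total, or roughly 93 minutes before ramping is taken into account.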

Comparative analysis
The 78 COI nucleotide sequences were aligned by eye. Some sequences were shorter than others (at the 5' and 3' ends) and the overall proportion of missing data in the alignment was 4.6%. The alignment comprised 655 positions, of which 62 were parsimony-informative. The alignment in Phylip format is provided as Suppl. material 2 ("Notopterus_align_78COI.phy"). We did not use an outgroup and therefore present only an unrooted haplotype network. Relationships amongst COI haplotypes were inferred with the programme PopArt (Leigh and Bryant 2015) using a median-joining algorithm (Bandelt et al. 1999) and default settings. Uncorrected pairwise genetic distances and numbers of nucleotide differences within and amongst groups, along with their respective standard errors (SE), were calculated with MEGA X. SE were obtained by a bootstrap procedure (500 replicates).
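For readers unfamiliar with the uncorrected p-distance, the calculation can be sketched as below. This is an illustrative reimplementation, not the MEGA X code: the function names are hypothetical, and the sketch assumes that sites where either sequence has a gap or an ambiguous base are skipped pair by pair (pairwise deletion), a setting that the text above does not specify. The bootstrap standard errors are not reproduced here.

```python
VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among sites where both sequences have A, C, G or T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES:
            compared += 1
            if base_a != base_b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def mean_between_groups(group_1, group_2):
    """Mean p-distance over all pairs made of one sequence from each group."""
    distances = [p_distance(a, b) for a in group_1 for b in group_2]
    return sum(distances) / len(distances)


def mean_within_group(group):
    """Mean p-distance over all distinct pairs within one group (needs two or more sequences)."""
    distances = [
        p_distance(group[i], group[j])
        for i in range(len(group))
        for j in range(i + 1, len(group))
    ]
    return sum(distances) / len(distances)
```

With pairwise deletion, the denominator varies between pairs when sequences are truncated at the 5' or 3' end, which is why a given number of nucleotide differences does not always correspond to the same percentage of a 655-position alignment.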

Results
The haplotype network is shown in Fig. 1B. Haplotypes segregate into two main groups with distinct distributions. The first group includes all haplotypes from India and Bangladesh (South Asia group). The second group includes all haplotypes from Peninsular Malaysia, Thailand (Mekong River), Indonesia (Sumatra and Java) and Myanmar (Lake Inle, Salween basin) (Southeast Asia group). These two main groups diverge by a mean p-distance of 7.5% [min-max = 7.0-8.8%; SE = 1%], which represents, on average, 45 differences [SE = 5.8] between any two specimens sampled from different groups.
In contrast, each of these two groups is genetically uniform, with a mean intra-group differentiation that does not exceed 1% (within the South Asia group, mean p-distance = 0.4%; min-max = 0.0-1.6%; SE = 0.1%; and within the Southeast Asia group, mean p-distance = 1%; min-max = 0.0-2.5%; SE = 0.2%). This represents six nucleotide differences [SE = 1.5] on average within the Southeast Asia group and only two nucleotide differences [SE = 0.6] on average within the South Asia group. Furthermore, the two specimens examined by Inoue et al. (2009) fell into the groups corresponding to their geographic origins.

Discussion
The minimal genetic distance separating the genus Notopterus into two main groups is well above 3% (using the COI marker), which is considered a conservative threshold between population and species levels in vertebrates (Ward et al. 2009). Furthermore, the existence of a so-called barcode gap (= intergroup distance/intragroup distance) of magnitude ~7X between these two groups, along with the fixation of more than 40 diagnostic nucleotide changes in COI, strongly indicates that Notopterus is formed by two species (Meyer and Paulay 2005; Meier et al. 2008). One species is distributed in South Asia (from the Indus basin to the Ganga-Brahmaputra basin) and the other in Southeast Asia (from the Salween basin to the Mekong basin plus the Malay Peninsula, Sumatra and Java). Roberts (1992) examined the variation of several meristic characters of N. notopterus throughout its entire range. He did not report any significant intraspecific variability that could provide additional evidence for the recognition of more than one species in the genus Notopterus. The low morphological variability within the genus Notopterus, however, is not surprising given that the family Notopteridae is known for its morphological stasis. Several valid species of Notopteridae are morphologically similar. For example, species of the genus Chitala are only distinguishable by their colour pattern (Roberts 1992) and more than one species is suspected to occur within Chitala lopis (Bleeker, 1851) (Kottelat and Widjanarti 2005; personal observation). Barby et al. (2019), and earlier cytogenetic studies cited in that work, reported the same karyotype formula in N. notopterus regardless of the origin of the specimens, with 2n = 42 and all chromosomes acrocentric.
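The barcode-gap argument made at the start of this discussion can be reproduced from the distances reported in the Results. The sketch below is only a worked calculation under one assumption: that the ~7X figure compares the mean inter-group distance with the larger of the two mean intra-group distances (that of the Southeast Asia group). The variable and function names are illustrative.

```python
# Values in percent, taken from the Results.
INTER_GROUP_MEAN = 7.5
INTER_GROUP_MIN = 7.0
INTRA_GROUP_MEAN = {"South Asia": 0.4, "Southeast Asia": 1.0}
SPECIES_THRESHOLD = 3.0  # conservative COI threshold cited from Ward et al. (2009)


def barcode_gap_ratio(inter_group, intra_group):
    """Ratio of inter-group to intra-group p-distance."""
    return inter_group / intra_group


ratios = {
    group: barcode_gap_ratio(INTER_GROUP_MEAN, intra)
    for group, intra in INTRA_GROUP_MEAN.items()
}

# The most conservative ratio uses the most variable group.
conservative_ratio = min(ratios.values())

# Even the smallest inter-group distance exceeds the species-level threshold.
exceeds_threshold = INTER_GROUP_MIN > SPECIES_THRESHOLD
```

Under this assumption the conservative ratio is 7.5, in line with the ~7X reported, whereas the ratio relative to the less variable South Asia group is considerably larger.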
Our COI-based results strongly support the presence of two allopatric species of Notopterus which need names. There are several nominal species of Notopterus which have been described from Southeast Asia and South Asia and several of these names are available. To determine which name should be applied to each of our two species, we examined the synonym lists of Fricke et al. (2020) and Kottelat (2013) and checked the date of description and type locality of these synonyms.
Notopterus notopterus was described and illustrated by Pallas in 1769 as Gymnotus notopterus from a specimen said to have been collected near Ambon [Ambon Island], Indian Ocean, a region where this species has never been recorded since. Kottelat (2013) and Fricke et al. (2020) suggested that the type locality given in the description of Pallas (1769) is an error because this species does not occur on the island of Ambon, which lies east of the Wallace Line, in a different biogeographical region. Furthermore, the local vernacular name of this species, "Ikan Pangaio", reported by Pallas, is in the Malay language, which was not used in Ambon at that time. For these reasons, Kottelat (2013) suggested that the specimen used by Pallas (1769) for the description of N. notopterus was probably collected in Java, where the Dutch established their main colony in Indonesia. Consequently, the name N. notopterus should be retained for the species occurring in Southeast Asia, with Java as its type locality. According to Fricke et al. (2020), there is no type specimen known.
Bloch and Schneider (1801) described Clupea synura Bloch & Schneider, 1801 from two syntypes, a name which was soon considered a junior synonym of N. notopterus (e.g. as in Cuvier and Valenciennes 1847). Bloch and Schneider (1801) first indicated that these specimens were from the coast of Malabar, India ("Habitat ad oram Malabaricam" in Bloch and Schneider [1801]). However, in a following remark signed only by J.G. Schneider, China and Tranquebar (now known as Tharangambadi, India) were listed as the localities of the syntypes. Notopterus, however, does not seem to occur in China and Paepke (1999) found convincing explanations for why Schneider could have confused China with the coast of Malabar, India (see Paepke 1999; Kottelat 2013; Fricke et al. 2020). According to Paepke (1999), the localities of the dry right skins of the two syntypes of Clupea synura, which are housed in the Berlin Museum under catalogue numbers ZMB 8806 and ZMB 32057, are the coast of Malabar and Tranquebar, respectively. We suggest the revalidation of Clupea synura for the species of Notopterus occurring in South Asia, which should be recognised as Notopterus synurus.
Whereas the genetic evidence presented in this work supports the recognition of two valid living species of Notopterus, a detailed morphological comparison of the two species is lacking. Such a morphological study is needed to identify possible diagnostic characters (in addition to the molecular diagnostic characters presented in this study) and to document the early diversification of the genus Notopterus in the Orient. In this respect, a fossil of Notopterus, morphologically similar to living species, indicates that this genus was already present in Sumatra at least 33 million years ago (the Eocene-Oligocene boundary) (Sanders 1934). In addition, because of the cryptic diversity occurring in the genus Notopterus and the difficulty of identifying the type localities of species of this genus, it will be important to designate a neotype for N. notopterus and a lectotype for N. synurus. Finally, the geographic coverage needs to be expanded with the study of specimens collected from the Mekong basin, the Indus basin and, especially, the region between the Ganga-Brahmaputra river system and the Salween basin, which includes the Irrawaddy basin, to determine the exact distributional limit between these two species.