Deciphering conserved identical sequences of mature miRNAs among six members of great apes

MicroRNAs (miRNAs) are a group of small RNA molecules that act as negative regulators of gene expression at the post-transcriptional level by binding to their target mRNAs. Because of their small size, their nucleotide compositions are expected to be similar, but the extent of this similarity has not yet been reported for humans and their phylogenetically closest relatives among the hominids. The present study allows direct comparison among six hominid species (Homo sapiens, Gorilla gorilla, Pan paniscus, Pongo pygmaeus, Pan troglodytes and Symphalangus syndactylus) in terms of their miRNA repertoire, their evolutionary distance to human, and the categorization of identical species-specific miRNAs. For this purpose, a total of 2694, 370, 157, 673, 590 and 10 mature miRNA sequences of Homo sapiens, Gorilla gorilla, Pan paniscus, Pongo pygmaeus, Pan troglodytes and Symphalangus syndactylus, respectively, were retrieved from miRBase 22. A total of 12, 4, 4 and 3 conserved clusters of identical miRNA sequences belonging to the same gene families were found in Homo sapiens, Gorilla gorilla, Pongo pygmaeus and Pan troglodytes, respectively, using the neighbor-joining method in MEGA7. Interestingly, cross-species comparison also revealed a set of conserved identical miRNA sequences. Homologs of human mature miRNAs with 100% sequence identity are expected to have similar functions in the studied primates. Further in vitro study is required to investigate common targets for identical miRNAs in the studied primates.

Multiple miRNAs are known to be produced from the same primary transcript, and the majority of miRNA clusters are transcribed as a single unit (Marco et al. 2013). The evolutionary importance of miRNA clusters has been the subject of much speculation (Wang et al. 2016). Many clusters contain members of the same family, suggesting an important role for gene duplication in their evolution (Berezikov 2011). On the other hand, some miRNA clusters contain members of different miRNA families, particularly in the animal kingdom (McCreight et al. 2017). Like other gene families, miRNAs are prone to forming paralogs, with the result that many miRNAs appear as homologous members of families (Hertel et al. 2006). However, the origin and evolution of these miRNA clusters have not been investigated in detail (Altuvia et al. 2005; Tanzer and Stadler 2004). Phylogenetic studies have shown that miRNAs are present throughout the evolution of metazoans. Comparisons of pre-miRNA sequences show that they are less conserved than the mature sequences alone. A high degree of identity across different species has been observed for mature miRNAs (Li et al. 2010). Many mature miRNAs are present in several species and are highly conserved, while others are confined to specific lineages. Several polycistronic transcripts suggest a potential mode of evolution for polycistronic miRNAs (Truscott et al. 2016). The miRNA repertoire is known to have increased continuously during metazoan evolution. However, the rate at which these molecules arose has varied over evolutionary time (Bartel 2018). Expansions of the miRNA repertoire have been linked with evolutionary innovations that led to the diversification of bilaterians. To date, the identification of orthologous miRNAs in different species has been investigated in primates.
Among the six members of great apes, Homo sapiens is the most thoroughly explored, with 2694 mature miRNAs described. In the present study, we took advantage of a recently available set of mature miRNAs from six members of the great apes to systematically detect identical miRNAs by comparing patterns of intra- and inter-species sequence similarity and their evolutionary distance. Interestingly, sets of identical mature miRNAs were found both within and between species in great apes, including humans. Further in vitro study is required to investigate common targets for identical miRNAs in the studied primates.

Alignment of sequences and phylogenetic analysis
Openly and freely available data on miRNAs are very limited. miRBase is one of the most widely cited databases and is easily accessible; in its latest release, 10883 pre-miRNAs are available. Datasets of mature miRNA sequences of Homo sapiens (n = 2694), Gorilla gorilla (n = 370), Pan paniscus (n = 157), Pongo pygmaeus (n = 673), Pan troglodytes (n = 590) and Symphalangus syndactylus (n = 10) were retrieved from the miRBase sequence database (a repository of published miRNA sequences and their annotation), release 22.0, at http://microrna.sanger.ac.uk. ClustalW was used to generate multiple alignments of the nucleic acid sequences (Chenna et al. 2003), and MEGA7 was used for phylogenetic analysis with the Neighbor-Joining method (Kumar et al. 2016).
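The retrieval step can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes the mature sequences are available as FASTA text whose identifiers begin with a three-letter species prefix (for example hsa for Homo sapiens or ggo for Gorilla gorilla, as in miRBase-style names). The record names and sequences in EXAMPLE are hypothetical placeholders, and the function names are chosen here only for illustration.

```python
from collections import Counter


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def species_code(header):
    """Return the prefix before the first hyphen of the identifier.

    Assumption: identifiers look like 'hsa-miR-21-5p', so the prefix
    is the species code.
    """
    return header.split()[0].split("-")[0]


def count_by_species(text):
    """Count mature miRNA records per species prefix."""
    return Counter(species_code(h) for h, _ in parse_fasta(text))


# Hypothetical toy records, not real miRBase entries.
EXAMPLE = """>hsa-toy-1 placeholder
ACGUACGUACGUACGUACGUAC
>ggo-toy-1 placeholder
ACGUACGUACGUACGUACGUAC
>hsa-toy-2 placeholder
UUGGCCAAUUGGCCAAUUGGCC
"""

if __name__ == "__main__":
    for code, n in sorted(count_by_species(EXAMPLE).items()):
        print(code, n)
```

Run on a full download, per-species counts of this kind could be compared against the dataset sizes listed above as a sanity check.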

Identification of homologous mature miRNA sequences within species in hominids
Homologous sequences in Homo sapiens were clustered based on their phylogenetic relationship and sequence identity using ClustalW. Multiple alignment of mature miRNAs revealed a conserved consensus. The Neighbor-Joining method was used for inferring the evolutionary history. The optimal tree with the sum of branch length = 2.32453654 is shown in Figure 1. The p-distance method was used for computing the evolutionary distances. The analysis involved 29 nucleotide sequences. All uncertain positions were deleted for each sequence pair. In the final dataset, there were a total of 36 positions. Evolutionary analyses were conducted in MEGA7. The conserved miRNA sequences were grouped into 12 clusters. The number of miRNA members in each cluster showing 100% sequence identity, together with their corresponding genomic coordinates, is shown in Table 1. It is interesting to note that the genomic locations of all the identical miRNAs are clustered. This suggests that the genes for these identical miRNAs originated through gene duplication during evolution and speciation. Identical miRNA sequences were also noted in Gorilla gorilla using the same phylogenetic model. The optimal tree with the sum of branch length = 0.40606061 is shown in Figure 2. The analysis involved 12 nucleotide sequences. There were a total of 35 positions in the final dataset. Only 12 miRNAs were found to be 100% identical, and these were grouped into four clusters. The number of miRNA members in each cluster showing 100% sequence identity, together with their corresponding genomic coordinates, is shown in Table 2.
Interestingly, all the identified conserved miRNAs belong to a single gene family (MIPF0000020; mir-515). Similarly, identical miRNA sequences were also noted in Pongo pygmaeus using the same phylogenetic model. A total of 8 identical miRNAs were grouped into 4 clusters, as shown in Table 3. The optimal tree with the sum of branch length = 1.37798461 is shown in Figure 3. The analysis involved 8 nucleotide sequences. There were a total of 32 positions in the final dataset. Likewise, identical miRNA sequences were noted in Pan troglodytes using the same phylogenetic model. Seven conserved miRNAs were found and grouped into three clusters, as shown in Table 4. The optimal tree with the sum of branch length = 0.71717172 is shown in Figure 4. The analysis involved 7 nucleotide sequences. There were a total of 26 positions in the final dataset.
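The distance measure named above can be made concrete with a short sketch. The p-distance is the proportion of compared sites at which two aligned sequences differ; with pairwise deletion, sites where either sequence has a gap or an uncertain base are skipped for that pair only. The Python below is an illustrative sketch under these definitions, not MEGA7's implementation. The set of characters treated as uncertain and the helper names are assumptions made here.

```python
from collections import defaultdict

# Assumption: these symbols mark gaps or uncertain bases in the alignment.
UNCERTAIN = {"-", "N", "?"}


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence carries a gap or uncertain symbol are
    excluded for this pair only (pairwise deletion).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in UNCERTAIN or y in UNCERTAIN:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def identical_groups(records):
    """Group record names by exact ungapped sequence.

    Returns a sorted list of name lists; only groups with at least two
    members (100% identical sequences) are kept.
    """
    groups = defaultdict(list)
    for name, seq in records:
        groups[seq.replace("-", "").upper()].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real miRNAs.
    toy = [("toy-a", "ACGUACGU"), ("toy-b", "ACGUACGU"), ("toy-c", "ACGUACGA")]
    print(p_distance(toy[0][1], toy[2][1]))
    print(identical_groups(toy))
```

A p-distance of zero between two full-length mature sequences corresponds to the 100% identity criterion used for the clusters reported in the tables.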

Identification of homologous mature miRNA sequences between species in hominids
To assess whether any miRNAs are conserved across hominid species, all known mature miRNAs were aligned using ClustalW to generate multiple alignments of the nucleic acid sequences, and MEGA7 was used for phylogenetic analysis. The optimal tree with the sum of branch length = 3.16412289 is shown in Figure 5. There were a total of 31 positions in the final dataset.
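One simple way to screen for cross-species identical sequences before tree building is to group records by exact sequence and keep the groups that span more than one species. The following Python sketch illustrates this idea. It is an assumption-laden illustration rather than the procedure used in the study: species are inferred from the identifier prefix, DNA-style T is normalized to U, and all names and sequences are hypothetical.

```python
from collections import defaultdict


def cross_species_identical(records):
    """Map each sequence shared by two or more species to its members.

    records: iterable of (name, sequence) pairs, where the name starts
    with a species prefix followed by a hyphen (assumed convention).
    Returns {sequence: {species: [names]}} for shared sequences only.
    """
    by_seq = defaultdict(lambda: defaultdict(list))
    for name, seq in records:
        species = name.split("-")[0]
        # Normalize case and DNA-style T so that equal RNAs compare equal.
        key = seq.upper().replace("T", "U")
        by_seq[key][species].append(name)
    return {
        seq: {sp: sorted(names) for sp, names in per_species.items()}
        for seq, per_species in by_seq.items()
        if len(per_species) > 1
    }


if __name__ == "__main__":
    # Hypothetical toy records, not real miRBase entries.
    toy = [
        ("hsa-toy-1", "ACGUACGUACGU"),
        ("ptr-toy-1", "ACGTACGTACGT"),
        ("hsa-toy-2", "UUGGCCAAUUGG"),
        ("hsa-toy-3", "UUGGCCAAUUGG"),
    ]
    for seq, members in cross_species_identical(toy).items():
        print(seq, members)
```

In the toy data, the sequence shared by hsa-toy-2 and hsa-toy-3 is excluded because both records come from one species; it would count as an intra-species group instead.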

Discussion
MiRNA-mediated gene regulation is a novel mechanism found among all lineages in the animal kingdom (Zhang et al. 2004). Many known miRNA genes in animal genomes are found as clusters. MiRNA clusters are groups of related miRNAs closely localized in the genome, with an evolution that remains poorly understood (Chen et al. 2015). Most of the clusters are transcribed as single polycistronic transcripts (Lagos-Quintana et al. 2003; Mourelatos et al. 2002). These clusters have been found to be highly conserved in most mammals. Insertions of new miRNAs, deletions of individual miRNAs, and cluster duplications observed in different species suggest actively evolving clusters. In the present study, intra-species and inter-species conserved identical miRNAs were identified in the six species of hominoids. Interestingly, a few miRNAs were conserved across the studied species, which reflects their evolutionary distance to humans, and identical species-specific miRNAs were also categorized in the six members of great apes. Most conserved miRNA clusters in the studied members of hominoids were found to belong to two families, mir-515 and mir-199, suggesting that the ancestral clusters may have originated by tandem duplication.
It has been demonstrated that some miRNA genes exhibit the phenomenon of clustering (Kurkewich et al. 2018). Interestingly, several studies have shown that miRNA clusters comprise two or more miRNA genes that display a high level of sequence identity. Moreover, they are situated contiguously in the genome (Gonzalez-Vallinas et al. 2018). The miRBase database is one of the primary repositories for miRNA genes identified experimentally and computationally (Griffiths-Jones et al. 2007; Kozomara and Griffiths-Jones 2013). No further data related to miRNA clusters are available in miRBase. Similarly, no further information about miRNA gene clusters is provided for exploring the evolutionary conservation of miRNA clusters across several species. The present data highlight the intra- and inter-species phylogenetic relationships of mature miRNAs in six species of great apes.

Conclusion
In this comparative study, conserved identical miRNA sequences were found among four hominid species. The applied approach (described in the materials and methods section) uses similarity-based criteria against identified conserved miRNA sequences to detect both closely related and more distantly related homologs. Further study is required to identify potential targets for identical miRNAs in the studied primates.

Figure 1. Evolutionary relationships of taxa for all mature miRNAs in Homo sapiens. The analysis involved 29 nucleotide sequences.

Figure 2. Evolutionary relationships of taxa for Gorilla gorilla. The optimal tree with the sum of branch length = 0.40606061 is shown. The analysis involved 12 nucleotide sequences.

Figure 3. Evolutionary relationships of taxa for Pongo pygmaeus. The analysis involved 8 nucleotide sequences.

Figure 4. Evolutionary relationships of taxa for Pan troglodytes. The optimal tree with the sum of branch length = 0.71717172 is shown. The analysis involved 7 nucleotide sequences.

Figure 5. Comparative evolutionary relationships of taxa for Homo sapiens, Gorilla gorilla, Pongo pygmaeus and Pan troglodytes. The analysis involved 54 nucleotide sequences.

Table 1. List of miRNAs grouped into clusters, their genomic coordinates, gene family names and their mature miRNA sequences in Homo sapiens.

Table 2. List of miRNAs grouped into clusters, their genomic coordinates, gene family names and their mature miRNA sequences in Gorilla gorilla.

Table 3. List of miRNAs grouped into clusters, their genomic coordinates, gene family names and their mature miRNA sequences in Pongo pygmaeus.