Plants’ miRNAs identification from deep-sequencing RNA-seq data using a multi-layer perceptron
DOI:
https://doi.org/10.59741/p05c6793
Keywords:
Fabaceae family, Artificial intelligence, microRNAs, Gene regulation in plants, Post-transcriptional regulation
Abstract
MicroRNA (miRNA)-mediated transcript degradation is a layer of post-transcriptional gene regulation with important roles in plants. Some plant traits of interest to the food industry are tightly regulated by this molecular mechanism. In Mexico, plants of the Fabaceae family are one of the main food sources. Studying this layer of regulation is therefore important for improving crop and seed production yields; however, identifying miRNA loci remains a pressing concern. The basic and ancillary criteria sometimes do not provide enough evidence to identify miRNA loci. Artificial intelligence (AI) approaches such as convolutional neural networks (CNNs) have shown excellent predictive performance in identifying miRNA loci, but some of these CNNs are complex and difficult to train and run. A multi-layer perceptron (MLP) model has been proposed for identifying pre-miRNA sequences; it processes 180 features, but the analysis is limited by the feature calculation, which is computationally intensive. In this work, we propose an AI approach based on an MLP model that is simple and easy to train, and we propose using k-mer frequencies to extract information from nucleotide sequences and from their secondary-structure representations. We tested several MLP design choices, such as the activation functions between layers and the number of dropout layers. The best-fitted models showed 84-90% sensitivity and 98-100% specificity when evaluated on the testing datasets. We also tested the predictive performance of the best-fitted models on real deep RNA-seq data. In conclusion, we present MLP-based AI models capable of identifying pre-miRNA sequences of Fabaceae plants from deep RNA-seq data; these models showed sensitivity values of 80-85% and specificity values of 90-95%.
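As a rough illustration of the k-mer frequency features and the sensitivity/specificity metrics mentioned in the abstract, the following minimal Python sketch may help. It is not the authors' implementation: the function names, the choice of k = 2, the RNA alphabet "ACGU", and the use of dot-bracket notation for the secondary structure are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, alphabet, k):
    """Relative frequency of every possible k-mer over `alphabet`.

    The output order is fixed (lexicographic over `alphabet`), so the
    vector can be used directly as input features for a classifier.
    Windows containing symbols outside `alphabet` are ignored.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (TP rate) and specificity (TN rate) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical toy inputs: a short RNA sequence and a made-up
# dot-bracket structure of the same length (assumed notation).
nucleotides = "ACGUACGU"
structure = "((....))"

# One feature vector per candidate: 16 dinucleotide frequencies
# followed by 9 structure 2-mer frequencies.
features = (kmer_frequencies(nucleotides, "ACGU", 2)
            + kmer_frequencies(structure, "(.)", 2))

# Hypothetical labels (1 = pre-miRNA, 0 = other) and predictions.
sens, spec = sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 0, 1])
```

Concatenating the two frequency vectors gives a fixed-length input regardless of sequence length, which is what lets a plain MLP consume sequences of different sizes. The value of k and the structure encoding actually used in the paper are not stated in the abstract.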
License
Copyright (c) 2025 Marco A. Juárez Verdayes, Javier Montalvo-Arredondo

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.