Plants’ miRNAs identification from deep-sequencing RNA-seq data using a multi-layer perceptron

Authors

  • Marco A. Juárez Verdayes, Molecular Bioengineering and Bioinformatics Laboratory, Departamento de Ciencias Básicas, Universidad Autónoma Agraria Antonio Narro, Calzada Antonio Narro 1923, CP 25315, Buenavista, Saltillo, Coahuila.
  • Javier Montalvo-Arredondo, Molecular Bioengineering and Bioinformatics Laboratory, Departamento de Ciencias Básicas, Universidad Autónoma Agraria Antonio Narro, Calzada Antonio Narro 1923, CP 25315, Buenavista, Saltillo, Coahuila.

DOI:

https://doi.org/10.59741/p05c6793

Keywords:

Fabaceae family, Artificial intelligence, microRNAs, Gene regulation in plants, Post-transcriptional regulation

Abstract

MicroRNA (miRNA)-mediated transcript degradation is a layer of post-transcriptional gene regulation that plays important roles in plants. Some plant traits of interest to the food industry are tightly regulated by this molecular mechanism. In Mexico, plants of the Fabaceae family are one of the main food sources. Accordingly, it is important to study this layer of regulation to improve crop and seed production yields; nonetheless, one of the pressing concerns is the identification of miRNA loci. The basic and ancillary annotation criteria sometimes do not provide enough evidence to identify miRNA loci. Artificial intelligence (AI) methods, such as convolutional neural networks (CNNs), have shown excellent predictive performance in identifying miRNA loci; however, some of these CNNs are complex and difficult to train and run. A multi-layer perceptron (MLP) model has been proposed for identifying pre-miRNA sequences; it processes 180 features, but the analysis is limited by the feature calculation, which is computationally intensive. In this work, we propose an AI based on an MLP model that is not complex and is easy to train. We also propose using k-mer frequencies to extract information from nucleotide sequences and from their secondary-structure representations. We tested several MLP design choices, such as the activation functions between layers and the number of dropout layers. The best-fitted models showed 84–90% sensitivity and 98–100% specificity when evaluated on the testing datasets. We then tested the predictive performance of the best-fitted models on real deep RNA-seq data. In conclusion, we present MLP-based AI models capable of identifying pre-miRNA sequences from plants of the Fabaceae family using deep RNA-seq data; these models showed sensitivity values of 80–85% and specificity values of 90–95%.
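
To make the k-mer idea concrete, the following minimal Python sketch shows one way a fixed-length feature vector could be built from a nucleotide sequence and its secondary-structure string. It is an illustration only, not the authors' implementation: the function name kmer_frequencies, the choice k = 2, the dot-bracket alphabet "(.)", the normalization by total k-mer count, and the toy hairpin are all assumptions, since the abstract does not state these details.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, alphabet, k):
    """Relative frequency of every possible k-mer over `alphabet`, in a fixed order."""
    # Enumerate all |alphabet|^k k-mers so every sequence maps to the same vector layout.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count overlapping k-mers with a sliding window of width k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


# Hypothetical toy hairpin; not taken from the paper's datasets.
nucleotides = "GGGAAACCC"
structure = "(((...)))"  # dot-bracket notation (assumed structure encoding)

# Concatenate both views into one input vector for a classifier such as an MLP.
features = (
    kmer_frequencies(nucleotides, "ACGU", 2)
    + kmer_frequencies(structure, "(.)", 2)
)
print(len(features))  # 16 nucleotide 2-mers + 9 structure 2-mers = 25
```

Because each k-mer only requires a sliding-window count, this kind of feature is cheap to compute compared with large hand-crafted feature sets; the vector length depends only on the alphabet size and k, not on the sequence length.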

References

Andrews, S. (2010). FastQC: A quality control tool for high throughput sequence data. Available at: https://www.bioinformatics.babraham.ac.uk/projects/fastqc

Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30, 2114–2120. DOI: https://doi.org/10.1093/bioinformatics/btu170

Cha, M., Zheng, H., Braham, C., Li, X., & Hu, H. (2021). A two-stream convolutional neural network for microRNA transcription start site feature integration and identification. Scientific Reports, 11, 5625. DOI: https://doi.org/10.1038/s41598-021-85173-x

Centeno-González, N. K., Martínez-Cabrera, H. I., Porras-Múzquiz, H., & Estrada-Ruiz, E. (2021). Late Campanian fossil of a legume fruit supports Mexico as a center of Fabaceae radiation. Communications Biology, 4, 41. DOI: https://doi.org/10.1038/s42003-020-01533-9

Dong, Q., Hu, B., & Zhang, C. (2022). microRNAs and their roles in plant development. Frontiers in Plant Science, 13, 824240. DOI: https://doi.org/10.3389/fpls.2022.824240

Gangadhar, B. H., Venkidasamy, B., Samynathan, R., Saranya, B., Chung, I.-M., & Thiruvengadam, M. (2021). Overview of miRNA biogenesis and applications in plants. Biologia, 76, 2309–2327. DOI: https://doi.org/10.1007/s11756-021-00763-4

Guo, Z., Kuang, Z., Wang, Y., Zhao, Y., Tao, Y., Cheng, C., . . . others. (2020). PmiREN: a comprehensive encyclopedia of plant miRNAs. Nucleic Acids Research, 48, D1114–D1121. DOI: https://doi.org/10.1093/nar/gkz894

Henderi, Wahyuningsih, T., & Rahwanto, E. (2021). Comparison of min-max normalization and z-score normalization in the K-nearest neighbor (kNN) algorithm to test the accuracy of types of breast cancer. 4(1), 13–20. DOI: https://doi.org/10.47738/ijiis.v4i1.73

Kim, D., Paggi, J. M., Park, C., Bennett, C., & Salzberg, S. L. (2019). Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature Biotechnology, 37, 907–915. DOI: https://doi.org/10.1038/s41587-019-0201-4

Kozomara, A., & Griffiths-Jones, S. (2010). miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Research, 39, D152–D157. DOI: https://doi.org/10.1093/nar/gkq1027

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84–90. DOI: https://doi.org/10.1145/3065386

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., . . . 1000 Genome Project Data Processing Subgroup. (2009). The sequence alignment/map format and SAMtools. Bioinformatics, 25, 2078–2079. DOI: https://doi.org/10.1093/bioinformatics/btp352

Lokuge, S., Jayasundara, S., Ihalagedara, P., Kahanda, I., & Herath, D. (2022). miRNAFinder: A comprehensive web resource for plant pre-microRNA classification. Biosystems, 215, 104662. DOI: https://doi.org/10.1016/j.biosystems.2022.104662

Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F., & Hofacker, I. L. (2011). ViennaRNA Package 2.0. Algorithms for Molecular Biology, 6, 1–14. DOI: https://doi.org/10.1186/1748-7188-6-26

Meyers, B. C., Axtell, M. J., Bartel, B., Bartel, D. P., Baulcombe, D., Bowman, J. L., . . . others. (2008). Criteria for annotation of plant MicroRNAs. The Plant Cell, 20, 3186–3190. DOI: https://doi.org/10.1105/tpc.108.064311

Owusu Adjei, M., Zhou, X., Mao, M., Rafique, F., & Ma, J. (2021). MicroRNAs roles in plants secondary metabolism. Plant Signaling & Behavior, 16, 1915590. DOI: https://doi.org/10.1080/15592324.2021.1915590

Pertea, G., & Pertea, M. (2020). GFF utilities: GffRead and GffCompare. F1000Research, 9. DOI: https://doi.org/10.12688/f1000research.23297.2

Rojo-Arias, J. E., & Busskamp, V. (2019). Challenges in microRNAs’ targetome prediction and validation. Neural Regeneration Research, 14, 1672–1677. DOI: https://doi.org/10.4103/1673-5374.257514

Shavanov, M. V. (2021). The role of food crops within the Poaceae and Fabaceae families as nutritional plants. IOP Conference Series: Earth and Environmental Science, 624, 012111. DOI: https://doi.org/10.1088/1755-1315/624/1/012111

Song, X., Li, Y., Cao, X., & Qi, Y. (2019). MicroRNAs and their regulatory roles in plant–environment interactions. Annual Review of Plant Biology, 70, 489–525. DOI: https://doi.org/10.1146/annurev-arplant-050718-100334

Tiwari, R., & Rajam, M. V. (2022). RNA- and miRNA-interference to enhance abiotic stress tolerance in plants. Journal of Plant Biochemistry and Biotechnology, 31, 689–704. DOI: https://doi.org/10.1007/s13562-022-00770-9

Trevethan, R. (2017). Sensitivity, specificity, and predictive values: foundations, pliabilities, and pitfalls in research and practice. Frontiers in Public Health, 5, 307. DOI: https://doi.org/10.3389/fpubh.2017.00307

Wani, S. H., Kumar, V., Khare, T., Tripathi, P., Shah, T., Ramakrishna, C., . . . Mangrauthia, S. K. (2020). miRNA applications for engineering abiotic stress tolerance in plants. Biologia, 75, 1063–1081. DOI: https://doi.org/10.2478/s11756-019-00397-7

Yang, X., Zhang, L., Yang, Y., Schmid, M., & Wang, Y. (2021). miRNA mediated regulation and interaction between plants and pathogens. International Journal of Molecular Sciences, 22, 2913. DOI: https://doi.org/10.3390/ijms22062913

Zhang, F., Yang, J., Zhang, N., Wu, J., & Si, H. (2022). Roles of microRNAs in abiotic stress response and characteristics regulation of plant. Frontiers in Plant Science, 13, 919243. DOI: https://doi.org/10.3389/fpls.2022.919243

Zhang, L., Xiang, Y., Chen, S., Shi, M., Jiang, X., He, Z., & Gao, S. (2022). Mechanisms of microRNA biogenesis and stability control in plants. Frontiers in Plant Science, 13, 844149. DOI: https://doi.org/10.3389/fpls.2022.844149

Zhang, Y., Huang, J., Xie, F., Huang, Q., Jiao, H., & Cheng, W. (2024). Identification of plant microRNAs using convolutional neural network. Frontiers in Plant Science, 15, 1330854. DOI: https://doi.org/10.3389/fpls.2024.1330854

Zhang, Z., Teotia, S., Tang, J., & Tang, G. (2019). Perspectives on microRNAs and phased small interfering RNAs in maize (Zea mays L.): functions and big impact on agronomic traits enhancement. Plants, 8, 170. DOI: https://doi.org/10.3390/plants8060170

Zhao, Y., Wang, G., Tang, C., Luo, C., Zeng, W., & Zha, Z.-J. (2021). A battle of network structures: An empirical study of CNN, Transformer, and MLP. arXiv preprint arXiv:2108.13002. DOI: https://doi.org/10.48550/arXiv.2108.13002.

Zheng, X., Xu, S., Zhang, Y., & Huang, X. (2019). Nucleotide-level convolutional neural networks for pre-miRNA classification. Scientific Reports, 9, 628. DOI: https://doi.org/10.1038/s41598-018-36946-4

Zielezinski, A., Vinga, S., Almeida, J., & Karlowski, W. M. (2017). Alignment-free sequence comparison: benefits, applications, and tools. Genome Biology, 18, 1–17. DOI: https://doi.org/10.1186/s13059-017-1319-7

Published

2025-03-19

How to Cite

Plants’ miRNAs identification from deep-sequencing RNA-seq data using a multi-layer perceptron. (2025). Universitas Agri, 4(1), 5-20. https://doi.org/10.59741/p05c6793
