pmiRScan: a LightGBM based method for prediction of animal pre-miRNAs

Impact factor 3.9; Zone 4 (Biology); JCR Q1 (Genetics & Heredity)
Amrit Venkatesan, Jolly Basak, Ranjit Prasad Bahadur
Functional & Integrative Genomics, volume 25, issue 1. Journal article, published 2025-01-09.
DOI: 10.1007/s10142-025-01527-y
https://link.springer.com/article/10.1007/s10142-025-01527-y

Abstract

MicroRNAs (miRNAs) are short endogenous non-coding RNAs that play a significant role in post-transcriptional gene regulation. Identifying new animal precursor miRNAs (pre-miRNAs) and miRNAs is crucial for understanding the role of miRNAs in various biological processes, including the development of diseases. The present study develops a Light Gradient Boost (LGB) based method for classifying animal pre-miRNAs using a range of sequence and secondary-structure features. Distinct k-mer repeat signatures with a length of three nucleotides are identified in various pre-miRNA families. Of the nine classifiers trained and tested in the present study, LGB performs best overall, with an AUROC of 0.959. Compared with existing methods, our method ‘pmiRScan’ performs better overall, with an accuracy of 0.93, sensitivity of 0.86, specificity of 0.95 and F-score of 0.82. Moreover, pmiRScan effectively classifies pre-miRNAs from four distinct taxonomic groups: mammals, nematodes, molluscs and arthropods. We have used our classifier to predict pre-miRNAs genome-wide in human, finding a total of 313 pre-miRNA candidates. From these predicted pre-miRNAs, 180 potential mature miRNAs belonging to 60 distinct miRNA families are extracted, of which 128 are novel and not reported in miRBase. These discoveries may enhance our current understanding of miRNAs and their targets in human. pmiRScan is freely available at http://www.csb.iitkgp.ac.in/applications/pmiRScan/index.php.
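
As an illustration of two ideas mentioned in the abstract, the sketch below computes 3-mer frequency features from an RNA sequence and derives accuracy, sensitivity, specificity and F-score from a confusion matrix. It is a minimal sketch, not the pmiRScan feature set or pipeline: the function names `kmer_frequencies` and `classification_metrics`, the choice of relative frequencies, the handling of ambiguous bases and the example counts are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"


def kmer_frequencies(sequence: str, k: int = 3) -> dict[str, float]:
    """Relative frequency of every possible k-mer over ACGU (hypothetical helper)."""
    # Accept DNA-style input by mapping T to U.
    seq = sequence.upper().replace("T", "U")
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Assumption: windows containing ambiguous symbols (e.g. N) are skipped.
    valid = [w for w in windows if set(w) <= set(ALPHABET)]
    counts = Counter(valid)
    total = len(valid)
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Toy sequence and toy counts; neither comes from the paper.
    features = kmer_frequencies("AUGAUGAUG")
    print(len(features), features["AUG"])
    print(classification_metrics(tp=8, fp=2, tn=18, fn=2))
```

A vector of 64 such 3-mer frequencies could serve as one group of sequence features for a gradient-boosting classifier; the secondary-structure features mentioned in the abstract would need a separate folding tool and are not covered here.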

Source journal: Functional & Integrative Genomics
CiteScore: 3.50
Self-citation rate: 3.40%
Articles published: 92
Review time: 2 months
About the journal: Functional & Integrative Genomics is devoted to large-scale studies of genomes and their functions, including systems analyses of biological processes. The journal provides the research community with an integrated platform where researchers can share, review and discuss their findings on important biological questions that will ultimately enable us to answer the fundamental question: How do genomes work?