AttenRNA: multi-scale deep attentive model with RNA feature variability analysis.

Impact factor 6.8; Zone 2 (Biology); JCR Q1, Biochemical Research Methods
Jing Li, Quan Zou, Chao Zhan
Briefings in Bioinformatics, vol. 26, no. 4. Published 2025-07-02. Journal Article.
DOI: https://doi.org/10.1093/bib/bbaf336
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12240734/pdf/

Abstract

Accurate identification of diverse RNA types, including messenger RNAs (mRNAs), long non-coding RNAs (lncRNAs), and circular RNAs (circRNAs), is essential for understanding their roles in gene regulation, disease progression, and epigenetic modification. Existing studies have primarily focused on binary classification tasks, such as distinguishing lncRNAs from mRNAs or identifying specific circRNAs, often overlooking the complex sequence patterns shared across multiple RNA types. To address this limitation, we developed AttenRNA, a multi-class classification model that integrates multi-scale k-mer embeddings and attention mechanisms to simultaneously differentiate between various RNA classes. AttenRNA achieved high weighted F1 scores of 89.8% and 89.6% on the validation and test sets, respectively, demonstrating strong classification performance and robustness. Dimensionality reduction using Uniform Manifold Approximation and Projection further confirmed the model's ability to learn discriminative features among RNA types. Additionally, AttenRNA exhibited strong generalization ability on cross-species data, achieving weighted F1 scores of 83.89% and 83.38% on the mouse RNA validation and test sets, respectively. These results suggest that AttenRNA offers a reliable and scalable solution for systematic RNA function analysis.
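The abstract names two concrete ingredients: multi-scale k-mer tokenization of the input sequence and the weighted F1 score used for evaluation. The sketch below illustrates both in plain Python. It is not the authors' implementation: the function names, the choice of k values (3, 4, 5), the stride of 1, and the toy labels are all assumptions made for illustration, and the embedding and attention layers are not shown.

```python
from collections import Counter


def kmer_tokens(seq, k):
    """Return the overlapping k-mers of a sequence (stride 1, an assumption)."""
    seq = seq.upper().replace("T", "U")  # normalize DNA-style input to RNA letters
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def multi_scale_kmers(seq, ks=(3, 4, 5)):
    """Tokenize one sequence at several k-mer scales (ks is a hypothetical default)."""
    return {k: kmer_tokens(seq, k) for k in ks}


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


if __name__ == "__main__":
    # Toy sequence and labels, invented for this example only.
    print(multi_scale_kmers("AUGGC", ks=(3, 4)))
    y_true = ["mRNA", "mRNA", "lncRNA", "circRNA"]
    y_pred = ["mRNA", "lncRNA", "lncRNA", "circRNA"]
    print(weighted_f1(y_true, y_pred))
```

In a model of the kind described, each scale's token list would typically be mapped to integer ids and passed to its own embedding table before attention is applied; that wiring is left out here because the abstract does not specify it.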

Source journal: Briefings in Bioinformatics (Biology: Biochemical Research Methods)
CiteScore: 13.20
Self-citation rate: 13.70%
Articles published: 549
Review time: 6 months
About the journal: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.