PMiSLocMF: predicting miRNA subcellular localizations by incorporating multi-source features of miRNAs.

IF 6.8 2区 生物学 Q1 BIOCHEMICAL RESEARCH METHODS
Lei Chen, Jiahui Gu, Bo Zhou
{"title":"PMiSLocMF: predicting miRNA subcellular localizations by incorporating multi-source features of miRNAs.","authors":"Lei Chen, Jiahui Gu, Bo Zhou","doi":"10.1093/bib/bbae386","DOIUrl":null,"url":null,"abstract":"<p><p>The microRNAs (miRNAs) play crucial roles in several biological processes. It is essential for a deeper insight into their functions and mechanisms by detecting their subcellular localizations. The traditional methods for determining miRNAs subcellular localizations are expensive. The computational methods are alternative ways to quickly predict miRNAs subcellular localizations. Although several computational methods have been proposed in this regard, the incomplete representations of miRNAs in these methods left the room for improvement. In this study, a novel computational method for predicting miRNA subcellular localizations, named PMiSLocMF, was developed. As lots of miRNAs have multiple subcellular localizations, this method was a multi-label classifier. Several properties of miRNA, such as miRNA sequences, miRNA functional similarity, miRNA-disease, miRNA-drug, and miRNA-mRNA associations were adopted for generating informative miRNA features. To this end, powerful algorithms [node2vec and graph attention auto-encoder (GATE)] and one newly designed scheme were adopted to process above properties, producing five feature types. All features were poured into self-attention and fully connected layers to make predictions. The cross-validation results indicated the high performance of PMiSLocMF with accuracy higher than 0.83, average area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) exceeding 0.90 and 0.77, respectively. Such performance was better than all previous methods based on the same dataset. Further tests proved that using all feature types can improve the performance of PMiSLocMF, and GATE and self-attention layer can help enhance the performance. Finally, we deeply analyzed the influence of miRNA associations with diseases, drugs, and mRNAs on PMiSLocMF. The dataset and codes are available at https://github.com/Gu20201017/PMiSLocMF.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":null,"pages":null},"PeriodicalIF":6.8000,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11330342/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Briefings in bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bib/bbae386","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

The microRNAs (miRNAs) play crucial roles in several biological processes. It is essential for a deeper insight into their functions and mechanisms by detecting their subcellular localizations. The traditional methods for determining miRNAs subcellular localizations are expensive. The computational methods are alternative ways to quickly predict miRNAs subcellular localizations. Although several computational methods have been proposed in this regard, the incomplete representations of miRNAs in these methods left the room for improvement. In this study, a novel computational method for predicting miRNA subcellular localizations, named PMiSLocMF, was developed. As lots of miRNAs have multiple subcellular localizations, this method was a multi-label classifier. Several properties of miRNA, such as miRNA sequences, miRNA functional similarity, miRNA-disease, miRNA-drug, and miRNA-mRNA associations were adopted for generating informative miRNA features. To this end, powerful algorithms [node2vec and graph attention auto-encoder (GATE)] and one newly designed scheme were adopted to process above properties, producing five feature types. All features were poured into self-attention and fully connected layers to make predictions. The cross-validation results indicated the high performance of PMiSLocMF with accuracy higher than 0.83, average area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) exceeding 0.90 and 0.77, respectively. Such performance was better than all previous methods based on the same dataset. Further tests proved that using all feature types can improve the performance of PMiSLocMF, and GATE and self-attention layer can help enhance the performance. Finally, we deeply analyzed the influence of miRNA associations with diseases, drugs, and mRNAs on PMiSLocMF. The dataset and codes are available at https://github.com/Gu20201017/PMiSLocMF.

PMiSLocMF:通过结合 miRNA 的多源特征预测 miRNA 的亚细胞定位。
微小核糖核酸(miRNA)在多个生物过程中发挥着至关重要的作用。要深入了解它们的功能和机制,就必须检测它们的亚细胞定位。确定 miRNAs 亚细胞定位的传统方法成本高昂。计算方法是快速预测 miRNAs 亚细胞定位的替代方法。虽然在这方面已经提出了几种计算方法,但由于这些方法对 miRNA 的表述不完整,因此还有改进的余地。本研究开发了一种预测 miRNA 亚细胞定位的新型计算方法,命名为 PMiSLocMF。由于很多 miRNA 有多种亚细胞定位,因此该方法是一种多标签分类器。该方法采用了 miRNA 的一些特性,如 miRNA 序列、miRNA 功能相似性、miRNA-疾病、miRNA-药物和 miRNA-mRNA 关联,以生成信息丰富的 miRNA 特征。为此,采用了功能强大的算法(node2vec 和图注意自动编码器(GATE))和一种新设计的方案来处理上述属性,产生了五种特征类型。所有特征都被注入自注意层和全连接层进行预测。交叉验证结果表明,PMiSLocMF 的准确率高于 0.83,平均接收器工作特征曲线下面积(AUC)和精确度-召回曲线下面积(AUPR)分别超过 0.90 和 0.77,表现出较高的性能。这样的表现优于之前所有基于相同数据集的方法。进一步的测试证明,使用所有特征类型都能提高 PMiSLocMF 的性能,而 GATE 和自我注意层也有助于提高性能。最后,我们深入分析了 miRNA 与疾病、药物和 mRNA 的关联对 PMiSLocMF 的影响。数据集和代码见 https://github.com/Gu20201017/PMiSLocMF。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Briefings in bioinformatics
Briefings in bioinformatics 生物-生化研究方法
CiteScore
13.20
自引率
13.70%
发文量
549
审稿时长
6 months
期刊介绍: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信