LncSTPred: a predictive model of lncRNA subcellular localization and decipherment of the biological determinants influencing localization

IF 3.9 | Zone 3, Biology | Q2, BIOCHEMISTRY & MOLECULAR BIOLOGY
Si-Le Hu, Ying-Li Chen, Lu-Qiang Zhang, Hui Bai, Jia-Hong Yang, Qian-Zhong Li
DOI: 10.3389/fmolb.2024.1452142 (https://doi.org/10.3389/fmolb.2024.1452142)
Journal: Frontiers in Molecular Biosciences
Publication date: 2024-09-05
Publication type: Journal Article
Citations: 0

Abstract
Introduction: Long non-coding RNAs (lncRNAs) play crucial roles in genetic markers, genome rearrangement, chromatin modifications, and other biological processes. Increasing evidence suggests that lncRNA functions are closely related to their subcellular localization. However, the distribution of lncRNAs in different subcellular localizations is imbalanced. The number of lncRNAs located in the nucleus is more than ten times that in the exosome.

Methods: In this study, we propose a new oversampling method to construct a predictive dataset and develop a predictive model called LncSTPred. This model improves the Adaboost algorithm for subcellular localization prediction using 3-mer, 3-RF sequence, and minimum free energy structure features.

Results and Discussion: By using our improved Adaboost algorithm, better prediction accuracy for lncRNA subcellular localization was obtained. In addition, we evaluated feature importance by using the F-score and analyzed the influence of highly relevant features on lncRNAs. Our study shows that the ANA features may be a key factor for predicting lncRNA subcellular localization, which correlates with the composition of stems and loops in the secondary structure of lncRNAs.
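To make two of the ingredients named in the abstract more concrete, the sketch below computes 3-mer frequency features for an RNA sequence and a two-class F-score for a single feature. It is only an illustration under stated assumptions: the abstract does not give the authors' exact feature definitions or F-score formula, so the code uses plain overlapping 3-mer frequencies and the commonly used two-class F-score (between-class separation of the means divided by the sum of within-class sample variances). The function names `three_mer_frequencies` and `f_score` are hypothetical, and the 3-RF, minimum free energy, and ANA features are not reproduced here.

```python
from itertools import product
from statistics import mean, variance

ALPHABET = "ACGU"
# All 64 possible 3-mers in a fixed order (AAA, AAC, ..., UUU).
KMERS = ["".join(p) for p in product(ALPHABET, repeat=3)]


def three_mer_frequencies(seq):
    """Return the 64 normalized overlapping 3-mer frequencies of an RNA sequence.

    Assumption: DNA-style input is accepted and T is mapped to U.
    """
    seq = seq.upper().replace("T", "U")
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[k] / total if total else 0.0 for k in KMERS]


def f_score(pos_values, neg_values):
    """Two-class F-score of one feature; larger means more discriminative.

    Assumption: this is the commonly used definition, not necessarily
    the exact variant used in the paper. Each class needs >= 2 values.
    """
    overall = mean(pos_values + neg_values)
    between = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    within = variance(pos_values) + variance(neg_values)  # sample variances (n - 1)
    return between / within if within else 0.0
```

A feature ranking could then be obtained by computing `f_score` for each of the 64 columns across two localization classes and sorting in descending order. For a multi-class setting such as several subcellular compartments, a one-vs-rest scheme is one possible extension, again as an assumption rather than the paper's stated procedure.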
Source journal
Frontiers in Molecular Biosciences (Biochemistry, Genetics and Molecular Biology: Biochemistry)
CiteScore: 7.20
Self-citation rate: 4.00%
Articles published: 1361
Review time: 14 weeks
About the journal: Much of contemporary investigation in the life sciences is devoted to the molecular-scale understanding of the relationships between genes and the environment, in particular dynamic alterations in the levels, modifications, and interactions of cellular effectors, including proteins. Frontiers in Molecular Biosciences offers an international publication platform for basic as well as applied research; we encourage contributions spanning both established and emerging areas of biology. To this end, the journal draws from empirical disciplines such as structural biology, enzymology, biochemistry, and biophysics, capitalizing as well on the technological advancements that have enabled metabolomics and proteomics measurements in massively parallel throughput, and the development of robust and innovative computational biology strategies. We also recognize influences from medicine and technology, welcoming studies in molecular genetics, molecular diagnostics and therapeutics, and nanotechnology. Our ultimate objective is the comprehensive illustration of the molecular mechanisms regulating proteins, nucleic acids, carbohydrates, lipids, and small metabolites in organisms across all branches of life. In addition to interesting new findings, techniques, and applications, Frontiers in Molecular Biosciences will consider new testable hypotheses to inspire different perspectives and stimulate scientific dialogue. The integration of in silico, in vitro, and in vivo approaches will benefit endeavors across all domains of the life sciences.