LncRNA Subcellular Localization Across Diverse Cell Lines: An Exploration Using Deep Learning with Inexact q-mers.

Impact Factor 3.6, JCR Q2 (Biochemistry & Molecular Biology)
Weijun Yi, Jason R Miller, Gangqing Hu, Donald A Adjeroh
Journal: Non-Coding RNA, volume 11, issue 4. Published 2025-06-25.
DOI: https://doi.org/10.3390/ncrna11040049
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12286058/pdf/

Abstract

Background: Long non-coding ribonucleic acids (lncRNAs) can localize to different cellular compartments, such as the nucleus and the cytoplasm, and their biological functions are influenced by where in the cell they reside. Compared to the vast number of lncRNAs, only a relatively small proportion have annotated subcellular localization. It would be helpful if those few annotated lncRNAs could be leveraged to develop predictive models for the localization of other lncRNAs.

Methods: Conventional computational methods extract q-mer profiles from lncRNA sequences and use them to train machine learning models such as support vector machines and logistic regression. These methods consider only exact q-mers. Given possible sequence mutations and other uncertainties in genomic sequences, and their role in biological function, accounting for this variability might improve our ability to model lncRNAs and their localization. We therefore build on inexact q-mers and use machine learning and deep learning techniques to study three specific problems in lncRNA subcellular localization: prediction of lncRNA localization using inexact q-mers, whether lncRNA localization is cell-type-specific, and the notion of switching (lncRNA) genes.

Results: We performed our analysis using lncRNA localization data from 15 cell lines. Our results showed that using inexact q-mers (with q = 6) can improve localization prediction performance compared to using exact q-mers. Further, we showed that lncRNA localization is, in general, not cell-line-specific. We also identified a category of lncRNAs that switch cellular compartments between cell lines, which we call switching lncRNAs. These switching lncRNAs complicate the prediction of lncRNA localization with machine learning models, showing that lncRNA localization prediction remains a major challenge.
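To make the contrast between exact and inexact q-mer profiles concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not define "inexact", so the sketch assumes an inexact match means Hamming distance of at most one substitution. The function names (`windows`, `exact_profile`, `inexact_profile`, `to_vector`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"


def windows(seq, q):
    """Yield each overlapping length-q window made only of A/C/G/U."""
    rna = seq.upper().replace("T", "U")  # accept DNA-style input as well
    for i in range(len(rna) - q + 1):
        window = rna[i:i + q]
        if set(window) <= set(ALPHABET):  # skip windows with N or other symbols
            yield window


def exact_profile(seq, q):
    """Count every overlapping q-mer exactly as it appears."""
    return Counter(windows(seq, q))


def neighbors_within_one(qmer):
    """Yield qmer itself plus every q-mer that differs by one substitution.

    ASSUMPTION: 'inexact' is modelled as Hamming distance <= 1; the paper
    may use a different definition.
    """
    yield qmer
    for i, original in enumerate(qmer):
        for base in ALPHABET:
            if base != original:
                yield qmer[:i] + base + qmer[i + 1:]


def inexact_profile(seq, q):
    """Credit each window to itself and to all of its one-mismatch neighbors."""
    profile = Counter()
    for window in windows(seq, q):
        profile.update(neighbors_within_one(window))
    return profile


def to_vector(profile, q):
    """Flatten a profile into a fixed-length feature vector of size 4**q."""
    return [profile["".join(p)] for p in product(ALPHABET, repeat=q)]


if __name__ == "__main__":
    seq = "ACGUACGUAGCUAGCUA"
    features = to_vector(inexact_profile(seq, 6), 6)
    print(len(features))  # 4**6 feature positions for q = 6
```

With q = 6 the feature vector has 4^6 = 4096 positions, and under the one-mismatch assumption each window contributes to 1 + 3q = 19 of them instead of one. A vector like this could then be passed to a classifier such as logistic regression or a neural network.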

Source journal: Non-Coding RNA
Subject area: Biochemistry, Genetics and Molecular Biology (Genetics)
CiteScore: 6.70
Self-citation rate: 4.70%
Articles published: 74
Review time: 10 weeks
Journal scope:

Functional studies dealing with identification, structure-function relationships or biological activity of: small regulatory RNAs (miRNAs, siRNAs and piRNAs) associated with the RNA interference pathway; small nuclear RNAs, small nucleolar RNAs and tRNA-derived small RNAs; other types of small RNAs, such as those associated with splice junctions and transcription start sites; long non-coding RNAs, including antisense RNAs, long "intergenic" RNAs, intronic RNAs and "enhancer" RNAs; other classes of RNAs such as vault RNAs, scaRNAs, circular RNAs, 7SL RNAs, telomeric and centromeric RNAs; regulatory functions of mRNAs and UTR-derived RNAs; catalytic and allosteric (riboswitch) RNAs; viral, transposon and repeat-derived RNAs; bacterial regulatory RNAs, including CRISPR RNAs.

Analysis of RNA processing, RNA binding proteins, RNA signaling and RNA interaction pathways: DICER, AGO, PIWI and PIWI-like proteins; other classes of RNA binding and RNA transport proteins; RNA interactions with chromatin-modifying complexes; RNA interactions with DNA and other RNAs; the role of RNA in the formation and function of specialized subnuclear organelles and other aspects of cell biology; intercellular and intergenerational RNA signaling; RNA processing; structure-function relationships in RNA complexes.

RNA analyses, informatics, tools and technologies: transcriptomic analyses and technologies; development of tools and technologies for RNA biology and therapeutics.

Translational studies involving long and short non-coding RNAs: identification of biomarkers; development of new therapies involving microRNAs and other ncRNAs; clinical studies involving microRNAs and other ncRNAs.