LncRNA Subcellular Localization Across Diverse Cell Lines: An Exploration Using Deep Learning with Inexact q-mers
Weijun Yi, Jason R Miller, Gangqing Hu, Donald A Adjeroh
Non-Coding RNA, volume 11, issue 4. Published 2025-06-25. Journal Article.
DOI: https://doi.org/10.3390/ncrna11040049
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12286058/pdf/
Impact factor: 3.6 (JCR Q2, Biochemistry & Molecular Biology)
Abstract
Background: Long non-coding ribonucleic acids (lncRNAs) can be localized to different cellular compartments, such as the nucleus and the cytoplasm. Their biological functions are influenced by the compartment in which they are located. Compared to the vast number of known lncRNAs, only a relatively small proportion have annotations regarding their subcellular localization. It would be helpful if those few annotated lncRNAs could be leveraged to develop predictive models for the localization of other lncRNAs.

Methods: Conventional computational methods extract q-mer profiles from lncRNA sequences and train machine learning models, such as support vector machines and logistic regression, on those profiles. These methods count only exact q-mers. Given possible sequence mutations and other uncertainties in genomic sequences, and their role in biological function, accounting for this variability might improve our ability to model lncRNAs and their localization. Thus, we build on inexact q-mers and use machine learning/deep learning techniques to study three specific problems in lncRNA subcellular localization: prediction of lncRNA localization using inexact q-mers, the question of whether lncRNA localization is cell-type-specific, and the notion of switching (lncRNA) genes.

Results: We performed our analysis using data on lncRNA localization across 15 cell lines. Our results showed that using inexact q-mers (with q = 6) can improve lncRNA localization prediction performance compared to using exact q-mers. Further, we showed that lncRNA localization, in general, is not cell-line-specific. We also identified a category of lncRNAs that switch cellular compartments between different cell lines (we call them switching lncRNAs). These switching lncRNAs complicate the problem of predicting lncRNA localization with machine learning models, showing that lncRNA localization prediction remains a major challenge.
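To make the difference between exact and inexact q-mer profiles concrete, the following is a minimal Python sketch. It rests on an assumption that the abstract does not confirm: here "inexact" is taken to mean that each sequence window also counts toward every q-mer within Hamming distance 1 of it. The paper may define inexact q-mers differently, and all function names (`windows`, `exact_profile`, `one_mismatch_neighbors`, `inexact_profile`, `to_vector`) are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def windows(seq, q):
    """Yield every length-q window of seq that contains only A/C/G/T."""
    seq = seq.upper().replace("U", "T")  # map RNA letters onto the DNA alphabet
    for i in range(len(seq) - q + 1):
        kmer = seq[i:i + q]
        if all(base in ALPHABET for base in kmer):
            yield kmer


def exact_profile(seq, q):
    """Count each q-mer exactly as it occurs in the sequence."""
    return Counter(windows(seq, q))


def one_mismatch_neighbors(kmer):
    """Yield kmer itself plus every q-mer that differs at exactly one position."""
    yield kmer
    for i, base in enumerate(kmer):
        for alt in ALPHABET:
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def inexact_profile(seq, q):
    """Assumed definition: a window counts toward all q-mers within Hamming distance 1."""
    counts = Counter()
    for kmer in windows(seq, q):
        counts.update(one_mismatch_neighbors(kmer))
    return counts


def to_vector(counts, q):
    """Lay the counts out as a fixed-length feature vector over all 4**q q-mers."""
    return [counts["".join(p)] for p in product(ALPHABET, repeat=q)]
```

With q = 6, the setting reported in the Results, `to_vector` produces a 4096-dimensional feature vector per sequence, which could then be fed to a classifier such as logistic regression or a neural network. Under this assumed definition, a single-base substitution in a sequence still contributes to the count of the original q-mer, which is the kind of tolerance to sequence variability that the Methods paragraph motivates.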
Journal: Non-Coding RNA
Subject area: Biochemistry, Genetics and Molecular Biology (Genetics)
CiteScore: 6.70
Self-citation rate: 4.70%
Articles published: 74
Review time: 10 weeks
Journal description:
Functional studies dealing with identification, structure-function relationships or biological activity of: small regulatory RNAs (miRNAs, siRNAs and piRNAs) associated with the RNA interference pathway; small nuclear RNAs, small nucleolar RNAs and tRNA-derived small RNAs; other types of small RNAs, such as those associated with splice junctions and transcription start sites; long non-coding RNAs, including antisense RNAs, long "intergenic" RNAs, intronic RNAs and "enhancer" RNAs; other classes of RNAs such as vault RNAs, scaRNAs, circular RNAs, 7SL RNAs, telomeric and centromeric RNAs; regulatory functions of mRNAs and UTR-derived RNAs; catalytic and allosteric (riboswitch) RNAs; viral, transposon and repeat-derived RNAs; bacterial regulatory RNAs, including CRISPR RNAs.

Analysis of RNA processing, RNA binding proteins, RNA signaling and RNA interaction pathways: DICER, AGO, PIWI and PIWI-like proteins; other classes of RNA binding and RNA transport proteins; RNA interactions with chromatin-modifying complexes; RNA interactions with DNA and other RNAs; the role of RNA in the formation and function of specialized subnuclear organelles and other aspects of cell biology; intercellular and intergenerational RNA signaling; RNA processing; structure-function relationships in RNA complexes.

RNA analyses, informatics, tools and technologies: transcriptomic analyses and technologies; development of tools and technologies for RNA biology and therapeutics.

Translational studies involving long and short non-coding RNAs: identification of biomarkers; development of new therapies involving microRNAs and other ncRNAs; clinical studies involving microRNAs and other ncRNAs.