Multimodal zero-shot learning of previously unseen epitranscriptomes from RNA-seq data
Yiyou Song, Bowen Song, Daiyun Huang, Anh Nguyen, Lihong Hu, Jia Meng, Yue Wang
Briefings in Bioinformatics, volume 26, issue 4. Published 2025-07-02.
DOI: https://doi.org/10.1093/bib/bbaf332
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12239634/pdf/
Impact factor: 6.8 (JCR Q1, Biochemical Research Methods)
Citations: 0
Abstract
Precise identification of condition-specific epitranscriptomes is critical for investigating the dynamics and versatile functions of RNA modification in various biological contexts. Existing approaches for predicting condition-specific RNA modification are usually trained on epitranscriptome data obtained from the same condition, which limits their use, because such data are available for only a small number of conditions owing to the technical difficulty and high cost of epitranscriptome profiling technologies. We present ExpressRM, a multimodal zero-shot learning framework for predicting condition-specific RNA modification sites in previously unseen contexts from genome and RNA-seq data. Unlike existing in-condition learning approaches, this method does not rely on matched epitranscriptome data for training, which greatly expands its applicability. On a benchmark dataset comprising epitranscriptomes and matched transcriptomes of 37 human tissues, we demonstrate that ExpressRM can accurately predict epitranscriptomes of previously unseen conditions from their transcriptomes alone, with performance comparable to existing in-condition learning algorithms that require epitranscriptome data from the same condition. Additionally, the method can differentiate highly dynamic RNA methylation sites from more static (housekeeping) ones. In a case study, we show that ExpressRM can uncover N6-methyladenosine RNA methylation sites in glioblastoma using only its RNA-seq data and reveal both novel and previously validated pathological insights. Together, these results suggest that the proposed multimodal zero-shot learning framework can effectively leverage transcriptome knowledge to explore the dynamic roles of RNA modifications in previously unseen experimental setups, providing valuable insights into the many biological contexts where RNA-seq is routinely used but epitranscriptome profiling has not yet been performed.
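The zero-shot setting described in the abstract can be illustrated with a minimal sketch. This is not the ExpressRM implementation: the function names, the one-hot sequence encoding, the toy expression vectors, and the tissue labels are all assumptions made for illustration. It shows only two ideas: each example combines a sequence modality with the expression profile of its condition, and held-out conditions contribute no examples to training.

```python
BASES = "ACGU"


def one_hot(seq):
    """Flattened one-hot encoding of an RNA sequence (hypothetical encoding)."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def build_example(seq, expression_profile):
    """Concatenate the sequence modality with the condition's expression modality."""
    return one_hot(seq) + list(expression_profile)


def zero_shot_split(samples, held_out_conditions):
    """Split samples so that held-out conditions never appear in training."""
    held_out = set(held_out_conditions)
    train = [s for s in samples if s["condition"] not in held_out]
    test = [s for s in samples if s["condition"] in held_out]
    return train, test


# Toy placeholders, not real data: three conditions with made-up
# expression vectors and two candidate site sequences.
expression = {
    "tissue_A": [0.2, 1.5, 0.0],
    "tissue_B": [1.1, 0.3, 0.7],
    "tissue_C": [0.9, 0.9, 0.1],
}
sites = ["GGACU", "AAACA"]

samples = [
    {"condition": cond, "features": build_example(seq, profile)}
    for cond, profile in expression.items()
    for seq in sites
]

# tissue_C plays the role of a previously unseen condition: a model would be
# fitted on `train` and evaluated on `test` using only tissue_C's expression.
train, test = zero_shot_split(samples, ["tissue_C"])
```

In this sketch the same site sequence appears under every condition, so any difference between predictions for different conditions would have to come from the expression features. That is the intuition behind predicting modification sites for a condition from its transcriptome alone.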
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.