The Advances in Deep Learning Modeling of Polyadenylation Codes

Impact factor 6.4; Zone 2, Biology; JCR Q1, Cell Biology
Emily Kunce Stroup, Tianjiao Sun, Qianru Li, John Carinato, Zhe Ji
Wiley Interdisciplinary Reviews: RNA, volume 16, issue 3, e70017. Journal article, published 2025-05-01.
DOI: 10.1002/wrna.70017 (https://doi.org/10.1002/wrna.70017)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12138237/pdf/
Citations: 0

Abstract

3'-end cleavage and polyadenylation is an essential step of eukaryotic mRNA and lncRNA expression. The formation of a polyadenylation (polyA) site is determined by combinatory effects of multiple tandem motifs (~6 motifs in humans), each of which is bound by a protein subcomplex. However, motif occurrences and compositions are quite variable across individual polyA sites, leading to the technical challenge of quantifying polyadenylation activities and defining cleavage sites. Although conventional motif enrichment analyses and machine learning models identified contributing polyadenylation motifs, these cannot unbiasedly quantify motif crosstalk. Recently, several groups developed deep learning models to resolve sequence complexity, capture complex positional interactions among cis-regulatory motifs, examine polyA site formation, predict cleavage probability, and calculate site strength. These deep learning models have brought novel insights into polyadenylation biology, such as site configuration differences across species, cleavage heterogeneity, genomic parameters regulating site expression, and human genetic variants altering polyadenylation activities. In this review, we summarize the advances of deep learning models developed to address facets of polyadenylation regulation and discuss applications of the models.
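
As a rough illustration of how sequence-based models of this kind can detect a cis-regulatory motif, the sketch below one-hot encodes a DNA sequence and slides a single convolution-style filter along it. It is not taken from any model discussed in the review: the filter is hand-set to the canonical AATAAA polyadenylation signal rather than learned, the toy sequence is hypothetical, and the function names (`one_hot`, `conv1d_scan`, `relu`) are assumptions chosen for this example.

```python
# Illustrative sketch only: a hand-set motif filter, not a trained model.
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element 0/1 vectors (A, C, G, T)."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def conv1d_scan(encoded, filt):
    """Slide the filter along the sequence (valid cross-correlation, stride 1)."""
    width = len(filt)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * filt[offset][channel]
        scores.append(total)
    return scores


def relu(scores, bias):
    """Add a bias, then clip negative activations to zero."""
    return [max(0.0, s + bias) for s in scores]


sequence = "GCTTGCAATAAAGTCTGATCCATTGTGTT"  # hypothetical toy sequence
filt = one_hot("AATAAA")  # filter scores +1 per base matching the hexamer

raw = conv1d_scan(one_hot(sequence), filt)
activated = relu(raw, bias=-5.0)  # only a perfect 6/6 match stays above zero
best = max(range(len(raw)), key=raw.__getitem__)
print(f"strongest match starts at position {best} with score {raw[best]}")
```

In a trained network, many such filters would be learned from data and followed by further layers that combine their positions and strengths; this sketch shows only the single-filter scanning step.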

Source journal
CiteScore: 14.80
Self-citation rate: 4.10%
Articles published: 67
Review time: 6-12 weeks
Journal description: WIREs RNA aims to provide comprehensive, up-to-date, and coherent coverage of this interesting and growing field, providing a framework for both RNA experts and interdisciplinary researchers to not only gain perspective in areas of RNA biology, but to generate new insights and applications as well. Major topics to be covered are: RNA Structure and Dynamics; RNA Evolution and Genomics; RNA-Based Catalysis; RNA Interactions with Proteins and Other Molecules; Translation; RNA Processing; RNA Export/Localization; RNA Turnover and Surveillance; Regulatory RNAs/RNAi/Riboswitches; RNA in Disease and Development; and RNA Methods.