Emily Kunce Stroup, Tianjiao Sun, Qianru Li, John Carinato, Zhe Ji
Wiley Interdisciplinary Reviews: RNA, 16(3), e70017. Published 2025-05-01. Journal Article.
DOI: 10.1002/wrna.70017 (https://doi.org/10.1002/wrna.70017)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12138237/pdf/
The Advances in Deep Learning Modeling of Polyadenylation Codes.
3'-end cleavage and polyadenylation is an essential step of eukaryotic mRNA and lncRNA expression. The formation of a polyadenylation (polyA) site is determined by combinatory effects of multiple tandem motifs (~6 motifs in humans), each of which is bound by a protein subcomplex. However, motif occurrences and compositions are quite variable across individual polyA sites, leading to the technical challenge of quantifying polyadenylation activities and defining cleavage sites. Although conventional motif enrichment analyses and machine learning models identified contributing polyadenylation motifs, these cannot unbiasedly quantify motif crosstalk. Recently, several groups developed deep learning models to resolve sequence complexity, capture complex positional interactions among cis-regulatory motifs, examine polyA site formation, predict cleavage probability, and calculate site strength. These deep learning models have brought novel insights into polyadenylation biology, such as site configuration differences across species, cleavage heterogeneity, genomic parameters regulating site expression, and human genetic variants altering polyadenylation activities. In this review, we summarize the advances of deep learning models developed to address facets of polyadenylation regulation and discuss applications of the models.
About the journal:
WIREs RNA aims to provide comprehensive, up-to-date, and coherent coverage of this interesting and growing field, providing a framework for both RNA experts and interdisciplinary researchers to not only gain perspective in areas of RNA biology, but to generate new insights and applications as well. Major topics to be covered are: RNA Structure and Dynamics; RNA Evolution and Genomics; RNA-Based Catalysis; RNA Interactions with Proteins and Other Molecules; Translation; RNA Processing; RNA Export/Localization; RNA Turnover and Surveillance; Regulatory RNAs/RNAi/Riboswitches; RNA in Disease and Development; and RNA Methods.