Chencheng Xu, Suying Bao, Ye Wang, Wenxing Li, Hao Chen, Yufeng Shen, Tao Jiang, Chaolin Zhang
{"title":"从序列中预测替代剪接和剪接改变突变的参考信息","authors":"Chencheng Xu, Suying Bao, Ye Wang, Wenxing Li, Hao Chen, Yufeng Shen, Tao Jiang, Chaolin Zhang","doi":"10.1101/gr.279044.124","DOIUrl":null,"url":null,"abstract":"Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for the quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural network model optimized to learn the impact of mutations on quantitative changes in alternative splicing from the comparative analysis of homologous genes. The model architecture enables DeltaSplice to perform \"reference-informed prediction\" by incorporating the known splice site usage of a reference gene sequence to improve its prediction on splicing-altering mutations. We benchmarked DeltaSplice and several other state-of-the-art methods on various prediction tasks, including evolutionary sequence divergence on lineage-specific splicing and splicing-altering mutations in human populations and neurodevelopmental disorders, and demonstrated that DeltaSplice outperformed consistently. DeltaSplice predicted ~15% of splicing quantitative trait loci (sQTLs) in the human brain as causal splicing-altering variants. It also predicted splicing-altering de novo mutations outside the splice sites in a subset of patients affected by autism and other neurodevelopmental disorders (NDD), including 19 genes with recurrent splicing-altering mutations. Integration of splicing-altering mutations with other types of denovo mutation burdens allowed prediction of eight novel NDD-risk genes. Our work expanded the capacity of in silico splicing models with potential applications in genetic diagnosis and the development of splicing-based precision medicine.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"79 1","pages":""},"PeriodicalIF":6.2000,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences\",\"authors\":\"Chencheng Xu, Suying Bao, Ye Wang, Wenxing Li, Hao Chen, Yufeng Shen, Tao Jiang, Chaolin Zhang\",\"doi\":\"10.1101/gr.279044.124\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for the quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural network model optimized to learn the impact of mutations on quantitative changes in alternative splicing from the comparative analysis of homologous genes. The model architecture enables DeltaSplice to perform \\\"reference-informed prediction\\\" by incorporating the known splice site usage of a reference gene sequence to improve its prediction on splicing-altering mutations. We benchmarked DeltaSplice and several other state-of-the-art methods on various prediction tasks, including evolutionary sequence divergence on lineage-specific splicing and splicing-altering mutations in human populations and neurodevelopmental disorders, and demonstrated that DeltaSplice outperformed consistently. DeltaSplice predicted ~15% of splicing quantitative trait loci (sQTLs) in the human brain as causal splicing-altering variants. It also predicted splicing-altering de novo mutations outside the splice sites in a subset of patients affected by autism and other neurodevelopmental disorders (NDD), including 19 genes with recurrent splicing-altering mutations. Integration of splicing-altering mutations with other types of denovo mutation burdens allowed prediction of eight novel NDD-risk genes. Our work expanded the capacity of in silico splicing models with potential applications in genetic diagnosis and the development of splicing-based precision medicine.\",\"PeriodicalId\":12678,\"journal\":{\"name\":\"Genome research\",\"volume\":\"79 1\",\"pages\":\"\"},\"PeriodicalIF\":6.2000,\"publicationDate\":\"2024-07-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Genome research\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1101/gr.279044.124\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMISTRY & MOLECULAR BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genome research","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1101/gr.279044.124","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences
Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for the quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural network model optimized to learn the impact of mutations on quantitative changes in alternative splicing from the comparative analysis of homologous genes. The model architecture enables DeltaSplice to perform "reference-informed prediction" by incorporating the known splice site usage of a reference gene sequence to improve its prediction on splicing-altering mutations. We benchmarked DeltaSplice and several other state-of-the-art methods on various prediction tasks, including evolutionary sequence divergence on lineage-specific splicing and splicing-altering mutations in human populations and neurodevelopmental disorders, and demonstrated that DeltaSplice outperformed consistently. DeltaSplice predicted ~15% of splicing quantitative trait loci (sQTLs) in the human brain as causal splicing-altering variants. It also predicted splicing-altering de novo mutations outside the splice sites in a subset of patients affected by autism and other neurodevelopmental disorders (NDD), including 19 genes with recurrent splicing-altering mutations. Integration of splicing-altering mutations with other types of denovo mutation burdens allowed prediction of eight novel NDD-risk genes. Our work expanded the capacity of in silico splicing models with potential applications in genetic diagnosis and the development of splicing-based precision medicine.
期刊介绍:
Launched in 1995, Genome Research is an international, continuously published, peer-reviewed journal that focuses on research that provides novel insights into the genome biology of all organisms, including advances in genomic medicine.
Among the topics considered by the journal are genome structure and function, comparative genomics, molecular evolution, genome-scale quantitative and population genetics, proteomics, epigenomics, and systems biology. The journal also features exciting gene discoveries and reports of cutting-edge computational biology and high-throughput methodologies.
New data in these areas are published as research papers, or methods and resource reports that provide novel information on technologies or tools that will be of interest to a broad readership. Complete data sets are presented electronically on the journal''s web site where appropriate. The journal also provides Reviews, Perspectives, and Insight/Outlook articles, which present commentary on the latest advances published both here and elsewhere, placing such progress in its broader biological context.