Robert Lesurf, Jeroen Breckpot, Jade Bouwmeester, Nour Hanafi, Anjali Jain, Yijing Liang, Tanya Papaz, Jane Lougheed, Tapas Mondal, Mahmoud Alsalehi, Luis Altamirano-Diaz, Erwin Oechslin, Enrique Audain, Gregor Dombrowsky, Alex V Postma, Odilia I Woudstra, Berto J Bouma, Marc-Phillip Hitz, Connie R Bezzina, Gillian M Blue, David S Winlaw, Seema Mital
{"title":"儿童心脏病中剪接干扰变异的特异性心脏模型经过验证。","authors":"Robert Lesurf, Jeroen Breckpot, Jade Bouwmeester, Nour Hanafi, Anjali Jain, Yijing Liang, Tanya Papaz, Jane Lougheed, Tapas Mondal, Mahmoud Alsalehi, Luis Altamirano-Diaz, Erwin Oechslin, Enrique Audain, Gregor Dombrowsky, Alex V Postma, Odilia I Woudstra, Berto J Bouma, Marc-Phillip Hitz, Connie R Bezzina, Gillian M Blue, David S Winlaw, Seema Mital","doi":"10.1186/s13073-024-01383-8","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Congenital heart disease (CHD) is the most common congenital anomaly. Almost 90% of isolated cases have an unexplained genetic etiology after clinical testing. Non-canonical splice variants that disrupt mRNA splicing through the loss or creation of exon boundaries are not routinely captured and/or evaluated by standard clinical genetic tests. Recent computational algorithms such as SpliceAI have shown an ability to predict such variants, but are not specific to cardiac-expressed genes and transcriptional isoforms.</p><p><strong>Methods: </strong>We used genome sequencing (GS) (n = 1101 CHD probands) and myocardial RNA-Sequencing (RNA-Seq) (n = 154 CHD and n = 43 cardiomyopathy probands) to identify and validate splice disrupting variants, and to develop a heart-specific model for canonical and non-canonical splice variants that can be applied to patients with CHD and cardiomyopathy. Two thousand five hundred seventy GS samples from the Medical Genome Reference Bank were analyzed as healthy controls.</p><p><strong>Results: </strong>Of 8583 rare DNA splice-disrupting variants initially identified using SpliceAI, 100 were associated with altered splice junctions in the corresponding patient myocardium affecting 95 genes. Using strength of myocardial gene expression and genome-wide DNA variant features that were confirmed to affect splicing in myocardial RNA, we trained a machine learning model for predicting cardiac-specific splice-disrupting variants (AUC 0.86 on internal validation). In a validation set of 48 CHD probands, the cardiac-specific model outperformed a SpliceAI model alone (AUC 0.94 vs 0.67 respectively). Application of this model to an additional 947 CHD probands with only GS data identified 1% patients with canonical and 11% patients with non-canonical splice-disrupting variants in CHD genes. Forty-nine percent of predicted splice-disrupting variants were intronic and > 10 bp from existing splice junctions. The burden of high-confidence splice-disrupting variants in CHD genes was 1.28-fold higher in CHD cases compared with healthy controls.</p><p><strong>Conclusions: </strong>A new cardiac-specific in silico model was developed using complementary GS and RNA-Seq data that improved genetic yield by identifying a significant burden of non-canonical splice variants associated with CHD that would not be detectable through panel or exome sequencing.</p>","PeriodicalId":12645,"journal":{"name":"Genome Medicine","volume":"16 1","pages":"119"},"PeriodicalIF":10.4000,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11476204/pdf/","citationCount":"0","resultStr":"{\"title\":\"A validated heart-specific model for splice-disrupting variants in childhood heart disease.\",\"authors\":\"Robert Lesurf, Jeroen Breckpot, Jade Bouwmeester, Nour Hanafi, Anjali Jain, Yijing Liang, Tanya Papaz, Jane Lougheed, Tapas Mondal, Mahmoud Alsalehi, Luis Altamirano-Diaz, Erwin Oechslin, Enrique Audain, Gregor Dombrowsky, Alex V Postma, Odilia I Woudstra, Berto J Bouma, Marc-Phillip Hitz, Connie R Bezzina, Gillian M Blue, David S Winlaw, Seema Mital\",\"doi\":\"10.1186/s13073-024-01383-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Congenital heart disease (CHD) is the most common congenital anomaly. Almost 90% of isolated cases have an unexplained genetic etiology after clinical testing. Non-canonical splice variants that disrupt mRNA splicing through the loss or creation of exon boundaries are not routinely captured and/or evaluated by standard clinical genetic tests. Recent computational algorithms such as SpliceAI have shown an ability to predict such variants, but are not specific to cardiac-expressed genes and transcriptional isoforms.</p><p><strong>Methods: </strong>We used genome sequencing (GS) (n = 1101 CHD probands) and myocardial RNA-Sequencing (RNA-Seq) (n = 154 CHD and n = 43 cardiomyopathy probands) to identify and validate splice disrupting variants, and to develop a heart-specific model for canonical and non-canonical splice variants that can be applied to patients with CHD and cardiomyopathy. Two thousand five hundred seventy GS samples from the Medical Genome Reference Bank were analyzed as healthy controls.</p><p><strong>Results: </strong>Of 8583 rare DNA splice-disrupting variants initially identified using SpliceAI, 100 were associated with altered splice junctions in the corresponding patient myocardium affecting 95 genes. Using strength of myocardial gene expression and genome-wide DNA variant features that were confirmed to affect splicing in myocardial RNA, we trained a machine learning model for predicting cardiac-specific splice-disrupting variants (AUC 0.86 on internal validation). In a validation set of 48 CHD probands, the cardiac-specific model outperformed a SpliceAI model alone (AUC 0.94 vs 0.67 respectively). Application of this model to an additional 947 CHD probands with only GS data identified 1% patients with canonical and 11% patients with non-canonical splice-disrupting variants in CHD genes. Forty-nine percent of predicted splice-disrupting variants were intronic and > 10 bp from existing splice junctions. The burden of high-confidence splice-disrupting variants in CHD genes was 1.28-fold higher in CHD cases compared with healthy controls.</p><p><strong>Conclusions: </strong>A new cardiac-specific in silico model was developed using complementary GS and RNA-Seq data that improved genetic yield by identifying a significant burden of non-canonical splice variants associated with CHD that would not be detectable through panel or exome sequencing.</p>\",\"PeriodicalId\":12645,\"journal\":{\"name\":\"Genome Medicine\",\"volume\":\"16 1\",\"pages\":\"119\"},\"PeriodicalIF\":10.4000,\"publicationDate\":\"2024-10-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11476204/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Genome Medicine\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1186/s13073-024-01383-8\",\"RegionNum\":1,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"GENETICS & HEREDITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genome Medicine","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s13073-024-01383-8","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
A validated heart-specific model for splice-disrupting variants in childhood heart disease.
Background: Congenital heart disease (CHD) is the most common congenital anomaly. Almost 90% of isolated cases have an unexplained genetic etiology after clinical testing. Non-canonical splice variants that disrupt mRNA splicing through the loss or creation of exon boundaries are not routinely captured and/or evaluated by standard clinical genetic tests. Recent computational algorithms such as SpliceAI have shown an ability to predict such variants, but are not specific to cardiac-expressed genes and transcriptional isoforms.
Methods: We used genome sequencing (GS) (n = 1101 CHD probands) and myocardial RNA-Sequencing (RNA-Seq) (n = 154 CHD and n = 43 cardiomyopathy probands) to identify and validate splice disrupting variants, and to develop a heart-specific model for canonical and non-canonical splice variants that can be applied to patients with CHD and cardiomyopathy. Two thousand five hundred seventy GS samples from the Medical Genome Reference Bank were analyzed as healthy controls.
Results: Of 8583 rare DNA splice-disrupting variants initially identified using SpliceAI, 100 were associated with altered splice junctions in the corresponding patient myocardium affecting 95 genes. Using strength of myocardial gene expression and genome-wide DNA variant features that were confirmed to affect splicing in myocardial RNA, we trained a machine learning model for predicting cardiac-specific splice-disrupting variants (AUC 0.86 on internal validation). In a validation set of 48 CHD probands, the cardiac-specific model outperformed a SpliceAI model alone (AUC 0.94 vs 0.67 respectively). Application of this model to an additional 947 CHD probands with only GS data identified 1% patients with canonical and 11% patients with non-canonical splice-disrupting variants in CHD genes. Forty-nine percent of predicted splice-disrupting variants were intronic and > 10 bp from existing splice junctions. The burden of high-confidence splice-disrupting variants in CHD genes was 1.28-fold higher in CHD cases compared with healthy controls.
Conclusions: A new cardiac-specific in silico model was developed using complementary GS and RNA-Seq data that improved genetic yield by identifying a significant burden of non-canonical splice variants associated with CHD that would not be detectable through panel or exome sequencing.
期刊介绍:
Genome Medicine is an open access journal that publishes outstanding research applying genetics, genomics, and multi-omics to understand, diagnose, and treat disease. Bridging basic science and clinical research, it covers areas such as cancer genomics, immuno-oncology, immunogenomics, infectious disease, microbiome, neurogenomics, systems medicine, clinical genomics, gene therapies, precision medicine, and clinical trials. The journal publishes original research, methods, software, and reviews to serve authors and promote broad interest and importance in the field.