Combining full-length gene assay and SpliceAI to interpret the splicing impact of all possible SPINK1 coding variants

Hao Wu, Jin-Huan Lin, Xin-Ying Tang, Wen-Bin Zou, Sacha Schutz, Emmanuelle Masson, Yann Fichou, Gerald Le Gac, Claude Ferec, Zhuan Liao, Jian-Min Chen
medRxiv (Cold Spring Harbor Laboratory), published 2023-11-14. DOI: 10.1101/2023.11.14.23298498 (https://doi.org/10.1101/2023.11.14.23298498)

Abstract

Background: Single-nucleotide variants (SNVs) within gene coding sequences can significantly impact pre-mRNA splicing, bearing profound implications for pathogenic mechanisms and precision medicine. However, reliable splicing analysis often faces practical limitations, especially when the relevant tissues are challenging to access. While in silico predictions are valuable, they alone do not meet clinical classification standards. In this study, we aim to harness the well-established full-length gene splicing assay (FLGSA) in conjunction with SpliceAI to prospectively interpret the splicing effects of all potential coding SNVs within the four-exon SPINK1 gene, a gene associated with chronic pancreatitis.

Results: We initiated the study with a retrospective correlation analysis (involving 27 previously FLGSA-analyzed SPINK1 coding SNVs), progressed to a prospective correlation analysis (incorporating 35 newly FLGSA-tested SPINK1 coding SNVs), followed by data extrapolation, and ended with further validation. In total, we analyzed 67 SPINK1 coding SNVs, representing 9.3% of all 720 possible coding SNVs and affecting 19.2% of the 240 coding nucleotides. Among these 67 FLGSA-analyzed SNVs, 12 were found to impact splicing. Through extensive cross-correlation of the FLGSA-obtained and SpliceAI-predicted data, we reasonably extrapolated that none of the unanalyzed 653 coding SNVs in the SPINK1 gene are likely to exert a significant effect on splicing. Out of these 12 splice-altering events, nine produced both wild-type and aberrant transcripts, while the remaining three exclusively generated aberrant transcripts. These splice-altering SNVs were predominantly concentrated in exons 1 and 2, particularly affecting the first and/or last coding nucleotide of each exon. Among the 12 splice-altering events, 11 were missense variants, constituting 2.17% of the 506 potential missense variants, while one was synonymous, accounting for 0.61% of the 164 potential synonymous variants.

Conclusions: Integrating FLGSA with SpliceAI, we conclude that less than 2% (1.67%) of all possible SPINK1 coding SNVs have a discernible influence on splicing outcomes. Our findings underscore the importance of performing splicing analysis in the broader genomic sequence context of the study gene, highlight the inherent uncertainties associated with intermediate SpliceAI scores (i.e., those ranging from 0.20 to 0.80), and have general implications for the shift from "retrospective" to "prospective" analysis in terms of variant classification.
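
The proportions quoted in the abstract can be rechecked from the reported counts. The following is a minimal sketch, not the authors' code. The helper names `pct` and `score_band` are hypothetical. Treating the 0.20 and 0.80 bounds as inclusive, and labelling the scores outside that range "low" and "high", are assumptions made here for illustration only.

```python
# Counts as reported in the abstract.
CODING_NUCLEOTIDES = 240
POSSIBLE_SNVS = CODING_NUCLEOTIDES * 3  # three alternative bases per position
ANALYZED_SNVS = 67
SPLICE_ALTERING = 12
MISSENSE_ALTERING, MISSENSE_POSSIBLE = 11, 506
SYNONYMOUS_ALTERING, SYNONYMOUS_POSSIBLE = 1, 164


def pct(part, whole, digits=2):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


def score_band(score, low=0.20, high=0.80):
    """Label a SpliceAI score relative to the intermediate range.

    Assumption: both bounds are inclusive, and the labels "low" and
    "high" are illustrative names, not terms from the abstract.
    """
    if score < low:
        return "low"
    if score <= high:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    print("possible coding SNVs:", POSSIBLE_SNVS)
    print("unanalyzed SNVs:", POSSIBLE_SNVS - ANALYZED_SNVS)
    print("analyzed (%):", pct(ANALYZED_SNVS, POSSIBLE_SNVS, 1))
    print("splice-altering (%):", pct(SPLICE_ALTERING, POSSIBLE_SNVS))
    print("missense (%):", pct(MISSENSE_ALTERING, MISSENSE_POSSIBLE))
    print("synonymous (%):", pct(SYNONYMOUS_ALTERING, SYNONYMOUS_POSSIBLE))
    print("band for 0.50:", score_band(0.50))
```

The counts reproduce the figures stated in the abstract (720 possible SNVs, 653 unanalyzed, and the 9.3%, 1.67%, 2.17% and 0.61% proportions).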