Resolving intra-repeat variation in medically relevant VNTRs from short-read sequencing data using the cardiovascular risk gene LPA as a model

IF 10.1 | Zone 1, Biology | JCR Q1, Biotechnology & Applied Microbiology
Silvia Di Maio, Peter Zöscher, Hansi Weissensteiner, Lukas Forer, Johanna F. Schachtl-Riess, Stephan Amstler, Gertraud Streiter, Cathrin Pfurtscheller, Bernhard Paulweber, Florian Kronenberg, Stefan Coassin, Sebastian Schönherr
Genome Biology. Published 2024-06-26. DOI: https://doi.org/10.1186/s13059-024-03316-5
Citations: 0

Abstract

Variable number tandem repeats (VNTRs) are highly polymorphic DNA regions harboring many potentially disease-causing variants. However, VNTRs often appear unresolved (“dark”) in variation databases due to their repetitive nature. One particularly complex and medically relevant VNTR is the KIV-2 VNTR, which is located in the cardiovascular disease gene LPA and encompasses up to 70% of its coding sequence. Using the highly complex LPA gene as a model, we develop a computational approach to resolve intra-repeat variation in VNTRs from widely available short-read sequencing data. We apply the approach to six protein-coding VNTRs in 2504 samples from the 1000 Genomes Project and develop an optimized method for the LPA KIV-2 VNTR that discriminates the confounding KIV-2 subtypes upfront. This results in an F1-score improvement of up to 2.1-fold compared to previously published strategies. Finally, we analyze the LPA VNTR in >199,000 UK Biobank samples, detecting >700 KIV-2 mutations. This approach reveals new strong Lp(a)-lowering effects for KIV-2 variants, with a protective effect against coronary artery disease, and also validates previous findings based on tagging SNPs. Our approach paves the way for reliable variant detection in VNTRs at scale, and we show that it is transferable to other dark regions, which will help unlock medical information hidden in VNTRs.
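The abstract reports performance as a fold improvement in F1-score. As a minimal sketch of how such a figure can be computed, the Python snippet below compares two call sets against a truth set. The variant tuples, the function name `f1_score`, and the toy data are illustrative assumptions and are not taken from the paper or its software.

```python
def f1_score(called, truth):
    """Return the F1-score (harmonic mean of precision and recall)
    for a collection of variant calls compared against a truth set."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # calls that are in the truth set
    fp = len(called - truth)   # calls that are not in the truth set
    fn = len(truth - called)   # truth variants that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: (position within repeat, ref allele, alt allele).
truth = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (512, "T", "C")}
baseline_calls = {(100, "A", "G"), (300, "C", "G"), (610, "A", "T")}
improved_calls = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (610, "A", "T")}

baseline_f1 = f1_score(baseline_calls, truth)
improved_f1 = f1_score(improved_calls, truth)

# Fold improvement is the ratio of the two F1-scores.
fold = improved_f1 / baseline_f1
print(f"baseline F1={baseline_f1:.3f}, improved F1={improved_f1:.3f}, fold={fold:.2f}")
```

A fold change expressed this way depends strongly on the baseline: a modest absolute gain can look large when the baseline F1-score is low, so the absolute scores are worth reporting alongside the ratio.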
Source journal: Genome Biology
Subject area: Biochemistry, Genetics and Molecular Biology - Genetics
CiteScore: 21.00
Self-citation rate: 3.30%
Articles published: 241
Review time: 2 months
Journal description: Genome Biology stands as a premier platform for exceptional research across all domains of biology and biomedicine, explored through a genomic and post-genomic lens. With an impact factor of 12.3 (2022), the journal is the 3rd-ranked research journal in the Genetics and Heredity category and the 2nd-ranked research journal in the Biotechnology and Applied Microbiology category by Thomson Reuters. Genome Biology is also the highest-ranked open-access journal in this category. Its team of in-house Editors collaborates closely with an Editorial Board of international experts, keeping the journal at the forefront of scientific advances and community standards. Regular engagement with researchers at conferences and institute visits helps the journal stay abreast of the latest developments in the field.