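Importance of genome reference and population datasets for annotation and prioritization of disease-causing variants in inherited retinal diseases.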

IF 1 | Zone 4, Medicine | Q4, GENETICS & HEREDITY
Stefan T Stafie, Mark Lindquist, Samuel Kusher-Lenhoff, Kenji Nakamichi, Debarshi Mustafi
Ophthalmic Genetics, pages 1-7. Published 2025-08-12. Journal Article.
DOI: 10.1080/13816810.2025.2544639 (https://doi.org/10.1080/13816810.2025.2544639)
Citations: 0

Importance of genome reference and population datasets for annotation and prioritization of disease-causing variants in inherited retinal diseases.

In an era of expanding sequencing technologies, increased variant identification requires assignment of potential functional impact to prioritize those variants that may be disease-causing. In this data note, we demonstrate the importance of using a refined human genome reference assembly and more diverse and curated population-based databases in guiding functional annotation of variants identified in inherited retinal disease (IRD) genes. We compared variant characteristics from Genome Aggregation Database (gnomAD) population data extracted for 372 IRD disease genes from versions 3.1.2 (v3) and 4.1.0 (v4), which are aligned to the most recent Genome Reference Consortium Human Build 38 (GRCh38), as well as from version 2.1.1 (v2), which is aligned to the previous GRCh37 build. Transformations of Variant Effect Predictor (VEP) annotations, Combined Annotation Dependent Depletion (CADD) scores, and ClinVar pathogenicity annotations were used to generate receiver-operating characteristic (ROC) curves and to calculate the area under the curve (AUC) and the area under the precision-recall curve (AUPRC). Comparisons of variant prediction by ClinVar designation showed that with improved functional annotation, the AUC climbs to 0.99 and the AUPRC is 0.98 in differentiating pathogenic from nonpathogenic variants when using the most recent genome build and population database. More diverse population data allow for identification of rare variants, and the incorporation of variant annotation metrics provides greater insight into pathogenicity parameters of IRD variants. This data note provides empirical evidence for adopting the newest genomic builds and databases to better prioritize variants as potentially disease-causing for more complete molecular diagnosis in IRD patients.
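As a rough illustration of the two metrics named in the abstract, the sketch below computes an AUC and an AUPRC from a list of binary labels (1 for a ClinVar-style pathogenic designation, 0 for nonpathogenic) and a numeric annotation score. It is a minimal sketch, not the authors' pipeline: the function names `roc_auc` and `average_precision`, the use of average precision as the AUPRC estimate, and the toy labels and CADD-like scores are all assumptions made for illustration and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (pathogenic, nonpathogenic) pairs in which the pathogenic variant
    has the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one variant of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).
    Variants sharing a score are added together at one threshold."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one pathogenic variant")
    tp = fp = 0
    prev_recall = 0.0
    area = 0.0
    for threshold in sorted(set(scores), reverse=True):
        for y, s in zip(labels, scores):
            if s == threshold:
                if y == 1:
                    tp += 1
                else:
                    fp += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Hypothetical toy data: 1 = pathogenic, 0 = nonpathogenic,
# paired with made-up CADD-like scores (not values from the study).
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [32.0, 28.5, 25.1, 24.0, 22.3, 12.4, 8.7, 3.1]

auc = roc_auc(labels, scores)
auprc = average_precision(labels, scores)
print(f"AUC = {auc:.4f}, AUPRC = {auprc:.4f}")
```

The pairwise form of the AUC is quadratic in the number of variants, which is acceptable for a small example; for gene-panel-scale data a rank-based computation or an established statistics library would be the more practical choice.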

Source journal
Ophthalmic Genetics (Medicine: Ophthalmology)
CiteScore: 2.40
Self-citation rate: 8.30%
Articles published: 126
Review time: >12 weeks
Journal description: Ophthalmic Genetics accepts original papers, review articles and short communications on the clinical and molecular genetic aspects of ocular diseases.