Assisting the analysis of insertions and deletions using regional allele frequencies

IF 3.9 | Zone 4, Biology | JCR Q1, GENETICS & HEREDITY
Sarath Babu Krishna Murthy, Sandy Yang, Shiraz Bheda, Nikita Tomar, Haiyue Li, Amir Yaghoobi, Atlas Khan, Krzysztof Kiryluk, Joshua E. Motelow, Nick Ren, Ali G. Gharavi, Hila Milo Rasouly
Journal: Functional & Integrative Genomics
Published: 2024-05-20
DOI: 10.1007/s10142-024-01358-3
URL: https://link.springer.com/article/10.1007/s10142-024-01358-3
Citations: 0

Abstract

Accurate estimation of population allele frequency (AF) is crucial for gene discovery and genetic diagnostics. However, determining AF for frameshift-inducing small insertions and deletions (indels) faces challenges due to discrepancies in mapping and variant calling methods. Here, we propose an innovative approach to assess indel AF. We developed CRAFTS-indels (Calculating Regional Allele Frequency Targeting Small indels), an algorithm that combines the AF of distinct indels within a given region and provides a “regional AF” (rAF). We tested and validated CRAFTS-indels using three independent datasets: gnomAD v2 (n=125,748 samples), an internal dataset (IGM; n=39,367), and the UK BioBank (UKBB; n=469,835). By comparing rAF against standard AF (sAF), we identified rare indels with rAF exceeding standard AF (sAF ≤ 10⁻⁴ and rAF > 10⁻⁴) as “rAF-hi” indels. Notably, a high percentage of rare indels were “rAF-hi”, with a higher proportion in gnomAD v2 (11-20%) and IGM (11-22%) compared to the UKBB (5-9%, depending on the CRAFTS-indels parameters). Analysis of the overlap of regions based on their rAF with low complexity regions and with ClinVar classification supported the pertinence of rAF. Using the internal dataset, we illustrated the utility of CRAFTS-indels in the analysis of de novo variants and the potential negative impact of rAF-hi indels in gene discovery. In summary, annotation of indels with cohort-specific rAF can be used to handle some of the limitations of current annotation pipelines and facilitate detection of novel gene-disease associations. CRAFTS-indels offers a user-friendly approach to providing rAF annotation. It can be integrated into public databases such as gnomAD and UKBB, and used by ClinVar to revise indel classifications.
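
The abstract does not spell out how CRAFTS-indels combines the frequencies, so the following is only an illustrative sketch, not the authors' implementation. It assumes that rAF is the sum of the standard AFs of all indels on the same chromosome within a fixed window around the target indel. The `Indel` class, the `window` parameter and its default of 10 bp, and the function names are all hypothetical; only the 10⁻⁴ threshold comes from the abstract.

```python
from dataclasses import dataclass

# Rarity threshold quoted in the abstract (sAF <= 1e-4 and rAF > 1e-4).
RARE_THRESHOLD = 1e-4


@dataclass(frozen=True)
class Indel:
    """Hypothetical record for one indel call in a cohort."""
    chrom: str
    pos: int     # 1-based position
    ref: str
    alt: str
    af: float    # standard allele frequency (sAF) in the cohort


def regional_af(target, indels, window=10):
    """Sum the sAF of all indels within `window` bp of `target`.

    ASSUMPTION: summation over a fixed window is a stand-in for
    whatever combination rule the real tool uses. `target` is
    counted only if it is itself present in `indels`.
    """
    total = sum(
        v.af
        for v in indels
        if v.chrom == target.chrom and abs(v.pos - target.pos) <= window
    )
    return min(total, 1.0)  # a frequency cannot exceed 1


def is_raf_hi(target, indels, window=10):
    """Apply the abstract's rule: rare by sAF, but not rare by rAF."""
    return (
        target.af <= RARE_THRESHOLD
        and regional_af(target, indels, window) > RARE_THRESHOLD
    )


# Toy data (made up for illustration): two nearby rare indels and one distant one.
indels = [
    Indel("1", 1000, "AT", "A", 5e-5),
    Indel("1", 1003, "T", "TA", 8e-5),
    Indel("1", 5000, "G", "GC", 2e-5),
]
flags = [is_raf_hi(v, indels) for v in indels]
```

In this toy cohort the two neighbouring indels are each rare on their own, but their combined frequency in the window exceeds 10⁻⁴, so both would be flagged as rAF-hi, while the isolated indel would not. A wider or narrower `window` changes which calls are grouped, which is consistent with the abstract's remark that the rAF-hi proportion depends on the tool's parameters.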

Source journal: Functional & Integrative Genomics
CiteScore: 3.50
Self-citation rate: 3.40%
Articles published: 92
Review time: 2 months
About the journal: Functional & Integrative Genomics is devoted to large-scale studies of genomes and their functions, including systems analyses of biological processes. The journal will provide the research community an integrated platform where researchers can share, review and discuss their findings on important biological questions that will ultimately enable us to answer the fundamental question: How do genomes work?