PRED-LD: efficient imputation of GWAS summary statistics.

IF 2.9 | Zone 3, Biology | Q2, BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics Pub Date : 2025-04-16 DOI:10.1186/s12859-025-06119-y

Georgios A Manios, Aikaterini Michailidi, Panagiota I Kontou, Pantelis G Bagos

Volume 26(1), article 107. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12004831/pdf/

Citations: 0

Abstract



PRED-LD: efficient imputation of GWAS summary statistics.

Background: Genome-wide association studies have identified connections between genetic variations and diseases, but they only examine a small portion of single nucleotide polymorphisms. To enhance genetic findings, researchers suggest imputing genotypes for unmeasured SNPs to improve coverage and statistical power. When this is not possible, summary statistics imputation can be used as an alternative. The available summary statistics imputation tools rely on reference panels, such as the 1000 Genomes Project, to estimate linkage disequilibrium (LD) between variants for accurate imputation. Tools like FAPI and SSIMP use these reference panels in variant call format (VCF) for this purpose, though this process can be time-consuming. A more effective approach for processing reference panels in summary statistics imputation was proposed in RAISS. In this approach, the LD among the variants is precomputed from the reference panel, prior to imputation, thereby reducing computational time.
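The precomputation idea described above can be sketched as follows. This is a minimal illustration, not RAISS's or PRED-LD's actual code: it assumes the reference panel is already loaded as per-sample allele dosages, uses the Pearson correlation of dosages as the LD measure r, and all names (`pearson_r`, `precompute_ld`, `max_distance`, `min_r2`) and the toy data are hypothetical.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # monomorphic variant: r is undefined
    return cov / math.sqrt(var_x * var_y)


def precompute_ld(genotypes, positions, max_distance=1_000_000, min_r2=0.0):
    """Build a lookup of pairwise r for variants within max_distance bp.

    genotypes: dict of variant id -> list of allele dosages (0/1/2),
               one entry per reference-panel sample (assumed input layout)
    positions: dict of variant id -> base-pair position on one chromosome
    """
    ld = {}
    ordered = sorted(genotypes, key=positions.get)
    for a, b in combinations(ordered, 2):
        if abs(positions[a] - positions[b]) > max_distance:
            continue  # too far apart to be worth storing
        r = pearson_r(genotypes[a], genotypes[b])
        if math.isnan(r) or r * r < min_r2:
            continue
        ld[(a, b)] = r
    return ld


# Toy reference panel: 6 samples, 3 variants (made-up data).
panel = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [0, 1, 2, 0, 1, 2],  # identical to rs1
    "rs3": [2, 1, 0, 2, 1, 0],  # mirror image of rs1
}
pos = {"rs1": 100, "rs2": 250, "rs3": 900}
ld_table = precompute_ld(panel, pos)
```

Because the table is built once per reference panel, the imputation step only needs dictionary lookups instead of re-reading VCF files, which is the source of the time saving the abstract refers to.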

Results: We present PRED-LD, an imputation method for GWAS summary statistics that aims to enhance the resolution of genetic association analyses. Given beta coefficients and standard errors, the method imputes summary statistics using precomputed LD statistics from HapMap, PhenoScanner and TOP-LD. The single-point approach that we describe provides a fast and accurate way to estimate associations for untyped SNPs that are in high LD with typed variants. Compared to existing tools, the proposed method is faster and provides accurate imputation. It has been implemented both as a web service ( https://compgen.dib.uth.gr/PRED-LD/ ) and as a command-line tool ( https://github.com/pbagos/PRED-LD ), making it a useful resource for the research community.
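The abstract does not spell out the estimator, so the sketch below illustrates one common single-proxy approximation from the summary-statistics imputation literature: the z-score of an untyped SNP is taken as r times the z-score (beta / SE) of the typed SNP in strongest LD with it. This is an assumption for illustration and may differ from what PRED-LD actually computes; the function name, the `min_r2` threshold, and the example values are hypothetical.

```python
def impute_single_point(typed, ld, min_r2=0.8):
    """Impute z-scores for untyped SNPs from their best typed proxy.

    typed: dict of snp id -> (beta, se) from the GWAS summary statistics
    ld:    dict of (snp_a, snp_b) -> signed r from a precomputed LD table;
           r is assumed to be signed with respect to the same effect
           alleles used in the GWAS
    Returns dict of untyped snp id -> {"z", "proxy", "r2"}.
    """
    best = {}
    for (a, b), r in ld.items():
        # Each stored pair can act in either direction: proxy -> target.
        for proxy, target in ((a, b), (b, a)):
            if proxy not in typed or target in typed:
                continue  # need a typed proxy and an untyped target
            r2 = r * r
            if r2 < min_r2:
                continue  # LD too weak for a single-proxy estimate
            if target not in best or r2 > best[target]["r2"]:
                beta, se = typed[proxy]
                best[target] = {"z": r * (beta / se), "proxy": proxy, "r2": r2}
    return best


# Made-up summary statistics and LD values.
typed_snps = {"rs1": (0.10, 0.02), "rs4": (-0.05, 0.025)}
ld_pairs = {("rs1", "rs2"): 0.95, ("rs1", "rs3"): 0.5, ("rs4", "rs2"): -0.9}
imputed = impute_single_point(typed_snps, ld_pairs)
```

In this toy input, rs2 has two typed neighbours above the threshold and the one with the larger r² is used, while rs3 is skipped because its only proxy has r² = 0.25. Converting the imputed z-score back to a beta and standard error would need additional information (for example allele frequencies and sample size) that is not modelled here.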

Conclusions: PRED-LD offers an efficient and accurate method for GWAS summary statistics imputation, with faster performance, direct interpretation of results, and support for multiple reference panels. In addition, the online version of PRED-LD makes it possible to obtain LD information and perform imputation without downloading reference panels, and it will be continuously updated to support tools for meta-analysis and fine-mapping in GWAS.


Source journal

BMC Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 5.70
Self-citation rate: 3.30%
Articles published: 506
Review time: 4.3 months

About the journal: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.