GWAS Procedures for Gene Mapping in Diverse Populations With Complex Structures.

Impact factor: 1.0 · JCR: Q3 · Category: Biology

Bio-protocol · Pub Date: 2025-04-20 · DOI: 10.21769/BioProtoc.5284

Zhen Zuo, Mingliang Li, Defu Liu, Qi Li, Bin Huang, Guanshi Ye, Jiabo Wang, You Tang, Zhiwu Zhang

Bio-protocol 15(8): e5284. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12021685/pdf/

Citations: 0

Abstract

As genotyping costs fall, genome-wide association studies (GWAS) face growing challenges in mapping genes of interest in diverse populations with complex structures. Complex population structure demands sophisticated statistical models, while higher marker density and larger population sizes require efficient computing tools. Many statistical models and computing tools have been developed, and they differ in statistical power, computing efficiency, and user-friendliness. Some statistical models were released with dedicated computing tools, such as efficient mixed-model association (EMMA), multiple loci mixed model (MLMM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK). Other computing tools (e.g., GAPIT) implement multiple statistical models behind a consistent user interface and continue to improve their handling of input data and interpretation of results. In this study, we developed a protocol that uses a minimal set of software tools (BEAGLE, BLINK, and GAPIT) to perform file format conversion, missing genotype imputation, GWAS, and interpretation of input data and results. We demonstrate the protocol by reanalyzing data from the Rice 3000 Genomes Project and highlight advances in GWAS model development.
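As an illustration of the file format conversion step mentioned in the abstract, the sketch below turns one VCF data line into numeric genotype codes (0, 1, or 2 copies of a non-reference allele), leaving missing calls as `None` so that an imputation step could fill them later. This is not the protocol's own code: the function names `gt_to_dosage` and `vcf_line_to_dosages` are hypothetical, and the sketch assumes biallelic markers and a standard tab-delimited VCF line with a `GT` entry in the FORMAT column.

```python
from typing import List, Optional, Tuple


def gt_to_dosage(gt: str) -> Optional[int]:
    """Count non-reference alleles in a VCF GT string such as '0/1' or '1|1'.

    Returns None when any allele is missing ('.'), so the call can be
    imputed later. Assumes biallelic markers (an assumption of this sketch).
    """
    alleles = gt.replace("|", "/").split("/")
    if any(allele == "." for allele in alleles):
        return None
    return sum(1 for allele in alleles if allele != "0")


def vcf_line_to_dosages(line: str) -> Tuple[str, List[Optional[int]]]:
    """Convert one VCF data line into (marker_id, per-sample dosages)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, marker = fields[0], fields[1], fields[2]
    # Fall back to chromosome_position when the ID column is empty.
    marker_id = marker if marker != "." else f"{chrom}_{pos}"
    # Locate the GT entry within the colon-separated FORMAT column.
    gt_index = fields[8].split(":").index("GT")
    dosages = [gt_to_dosage(sample.split(":")[gt_index]) for sample in fields[9:]]
    return marker_id, dosages


example = "1\t1000\t.\tA\tG\t.\tPASS\t.\tGT:DP\t0/0:5\t0|1:7\t1/1:3\t./.:0"
marker_id, dosages = vcf_line_to_dosages(example)
print(marker_id, dosages)
```

Keeping missing genotypes as `None` rather than guessing a value keeps conversion and imputation as separate steps, in line with the order of analyses listed in the abstract.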


Source journal

Bio-protocol

CiteScore

1.50

Self-citation rate

0.00%

Articles published