PopCluster: A Population Genetics Model-Based Toolset for Simulating, Inferring and Visualising Individual Admixture and Population Structure.

IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY

Molecular Ecology Resources, e14058 | Pub Date: 2024-12-26 | DOI: 10.1111/1755-0998.14058

Jinliang Wang


Citations: 0

Abstract


PopCluster: A Population Genetics Model-Based Toolset for Simulating, Inferring and Visualising Individual Admixture and Population Structure.

In this computer note I introduce software, PopCluster, that implements a new likelihood method for unsupervised population structure analysis from marker data. To infer a coarse population structure, it assumes the mixture model and adopts a simulated annealing algorithm to make a maximum likelihood clustering analysis, partitioning the sampled individuals into a predefined number of clusters. To deduce a fine population structure, it further assumes the admixture model and employs an expectation maximisation algorithm to estimate individual admixture proportions. PopCluster has many features. First, it is one of just a couple of model-based methods that can handle both biallelic and multiallelic markers in the same framework. Second, it is the first population structure analysis method that uses both Message Passing Interface (MPI) and openMP to exploit multiple CPUs with both shared and distributed memories and has the capacity to handle genomic data with millions of individuals and millions of loci. Third, the algorithms for both mixture and admixture analyses are fast, rendering PopCluster favourably in computational efficiency over previous methods in analysing genomic data. Fourth, PopCluster is built for Windows, Linux and Mac platforms, and its Windows version has an integrated GUI that can conveniently visualise analysis results and facilitate data input. Fifth, its Windows version has a built-in simulation module designed to simulate genotype data under admixture, hybridization or migration models. PopCluster provides a valuable toolset for researchers to simulate, infer and visualise individual admixture and population genetic structure, hybridization and migration using marker data.
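As a rough illustration of the expectation maximisation step mentioned in the abstract, the Python sketch below estimates the admixture proportions of a single individual from biallelic genotypes. It is not PopCluster's code or its likelihood: it assumes the cluster allele frequencies are already known, treats loci and allele copies as independent, and the function name `em_admixture` and all of its parameters are hypothetical names chosen for this example.

```python
def em_admixture(genotype, freqs, n_iter=200, tol=1e-10):
    """Estimate one individual's admixture proportions by a textbook EM.

    genotype: reference-allele counts (0, 1 or 2), one entry per locus.
    freqs:    freqs[k][locus] is the reference-allele frequency of cluster k,
              assumed known here (an illustrative simplification).
    Returns a list q with one ancestry proportion per cluster.
    """
    n_clusters = len(freqs)
    q = [1.0 / n_clusters] * n_clusters  # start from equal ancestry
    for _ in range(n_iter):
        expected = [0.0] * n_clusters
        for locus, g in enumerate(genotype):
            # Handle the reference copies, then the alternative copies.
            for copies, is_ref in ((g, True), (2 - g, False)):
                if copies == 0:
                    continue
                # E-step: posterior that an allele copy came from cluster k.
                weights = [
                    q[k] * (freqs[k][locus] if is_ref else 1.0 - freqs[k][locus])
                    for k in range(n_clusters)
                ]
                total = sum(weights)
                if total == 0.0:
                    continue  # allele impossible under every cluster; skip it
                for k in range(n_clusters):
                    expected[k] += copies * weights[k] / total
        norm = sum(expected)
        if norm == 0.0:
            break
        # M-step: new proportions are the normalised expected allele counts.
        new_q = [e / norm for e in expected]
        change = max(abs(a - b) for a, b in zip(new_q, q))
        q = new_q
        if change < tol:
            break
    return q


if __name__ == "__main__":
    # Two hypothetical clusters with strongly diverged allele frequencies.
    freqs = [[0.9] * 50, [0.1] * 50]
    print(em_admixture([2] * 50, freqs))  # homozygous reference at every locus
    print(em_admixture([1] * 50, freqs))  # heterozygous at every locus
```

In a full analysis the allele frequencies would themselves be unknown and would have to be estimated jointly with the proportions; the sketch fixes them only to keep the E-step and M-step easy to read.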


Source journal

Molecular Ecology Resources (Biology: Evolutionary Biology)

CiteScore: 15.60

Self-citation rate: 5.20%

Articles published: 170

Review time: 3 months

About the journal: Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines. In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.