Decoding coalescent hidden Markov models in linear time.

Research in computational molecular biology : ... Annual International Conference, RECOMB ... : proceedings. RECOMB (Conference : 2005- )
Pub Date: 2014-01-01
DOI: 10.1007/978-3-319-05269-4_8

Kelley Harris, Sara Sheehan, John A Kamm, Yun S Song

Volume: 8394
Pages: 100-114
Publication type: Journal Article

Citations: 10

Abstract

In many areas of computational biology, hidden Markov models (HMMs) have been used to model local genomic features. In particular, coalescent HMMs have been used to infer ancient population sizes, migration rates, divergence times, and other parameters such as mutation and recombination rates. As more loci, sequences, and hidden states are added to the model, however, the runtime of coalescent HMMs can quickly become prohibitive. Here we present a new algorithm for reducing the runtime of coalescent HMMs from quadratic in the number of hidden time states to linear, without making any additional approximations. Our algorithm can be incorporated into various coalescent HMMs, including the popular method PSMC for inferring variable effective population sizes. Here we implement this algorithm to speed up our demographic inference method diCal, which is equivalent to PSMC when applied to a sample of two haplotypes. We demonstrate that the linear-time method can reconstruct a population size change history more accurately than the quadratic-time method, given similar computation resources. We also apply the method to data from the 1000 Genomes project, inferring a high-resolution history of size changes in the European population.
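The quadratic-versus-linear distinction can be made concrete with a generic sketch. In a standard HMM forward pass with d hidden states, each locus costs O(d^2) because the forward vector is multiplied by a dense d x d transition matrix. When the transition matrix has exploitable structure, the same update can be computed in O(d). The Python sketch below uses a hypothetical toy structure (a diagonal "stay" term plus a rank-one "jump" term). This structure and the names `forward_dense`, `forward_structured`, `stay`, `leave`, and `jump` are assumptions for illustration only; they are not the transition structure or the algorithm of Harris et al.

```python
import numpy as np


def forward_dense(init, trans, emis):
    """Scaled forward algorithm with a dense transition matrix.

    init:  (d,) initial state distribution
    trans: (d, d) row-stochastic transition matrix
    emis:  (L, d) emission probability of each locus under each state
    Returns the log-likelihood. Each locus costs O(d^2).
    """
    f = init * emis[0]
    scale = f.sum()
    loglik = np.log(scale)
    f = f / scale
    for e in emis[1:]:
        f = (f @ trans) * e          # dense vector-matrix product: O(d^2)
        scale = f.sum()
        loglik += np.log(scale)
        f = f / scale
    return loglik


def forward_structured(init, stay, leave, jump, emis):
    """Same recursion for the toy case trans = diag(stay) + outer(leave, jump).

    The product f @ trans then equals f * stay + (f @ leave) * jump,
    which needs only O(d) work per locus.
    """
    f = init * emis[0]
    scale = f.sum()
    loglik = np.log(scale)
    f = f / scale
    for e in emis[1:]:
        f = (f * stay + (f @ leave) * jump) * e   # O(d) update
        scale = f.sum()
        loglik += np.log(scale)
        f = f / scale
    return loglik
```

Both functions return the same log-likelihood whenever `trans` equals `np.diag(stay) + np.outer(leave, jump)`; only the per-locus cost differs. The result is exact rather than approximate because the structured update is an algebraic rewriting of the dense product.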
