HMEC: A Heuristic Algorithm for Individual Haplotyping with Minimum Error Correction
Authors: Md Shamsuzzoha Bayzid, Md Maksudul Alam, Abdullah Mueen, Md Saidur Rahman
Journal: ISRN Bioinformatics, vol. 2013, article 291741
Published: 2013-01-28
Publication type: Journal Article
DOI: 10.1155/2013/291741 (https://doi.org/10.1155/2013/291741)
Citations: 8
Abstract
A haplotype is the pattern of single nucleotide polymorphisms (SNPs) on a single chromosome. Constructing a pair of haplotypes from fragments of the chromosomal sequences that are aligned and overlapping, but intermixed and erroneous, is a nontrivial problem. The minimum error correction (MEC) approach minimizes the number of errors that must be corrected so that the pair of haplotypes can be constructed by consensus of the fragments. We give a heuristic algorithm (HMEC) that searches through alternative solutions using a gain measure and stops when no better solution can be found. The time complexity of each iteration is O(m^3 k) for an m × k SNP matrix, where m is the number of fragments (rows) and k is the number of SNP sites (columns). An alternative gain measure is also given to reduce running time. We compared our algorithm with other methods in terms of accuracy and running time on both simulated and real data; the experimental results indicate that it outperforms them.
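To make the MEC objective and the idea of a gain-driven search concrete, here is a minimal Python sketch. It is not the HMEC procedure from the paper: the abstract does not specify the gain measure, so the sketch assumes the simplest possible one (the drop in MEC score when a single fragment is moved to the other group) and recomputes the score from scratch for each candidate move. The '-' gap symbol, the majority-vote consensus with ties resolved to '0', and all function names are illustrative assumptions.

```python
from typing import List, Tuple

GAP = "-"  # assumed symbol for an SNP site that a fragment does not cover


def consensus(rows: List[str], k: int) -> str:
    """Majority vote per column; ties and uncovered columns default to '0'."""
    out = []
    for j in range(k):
        ones = sum(1 for r in rows if r[j] == "1")
        zeros = sum(1 for r in rows if r[j] == "0")
        out.append("1" if ones > zeros else "0")
    return "".join(out)


def mec_score(matrix: List[str], side: List[int]) -> int:
    """Count entries that disagree with the consensus of their own group."""
    k = len(matrix[0])
    total = 0
    for g in (0, 1):
        rows = [r for r, s in zip(matrix, side) if s == g]
        h = consensus(rows, k)
        total += sum(
            1 for r in rows for j in range(k) if r[j] != GAP and r[j] != h[j]
        )
    return total


def local_search(matrix: List[str], side: List[int]) -> Tuple[List[int], int]:
    """Apply the single-fragment move with the largest positive gain until none is left."""
    side = list(side)  # work on a copy of the starting partition
    best = mec_score(matrix, side)
    while True:
        best_gain, best_i = 0, -1
        for i in range(len(matrix)):
            side[i] ^= 1  # tentatively move fragment i to the other group
            gain = best - mec_score(matrix, side)
            side[i] ^= 1  # undo the tentative move
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i < 0:  # no move improves the score: stop
            return side, best
        side[best_i] ^= 1
        best -= best_gain


# Six fragments over four SNP sites; rows 2 and 5 start in the wrong group.
matrix = ["0000", "00-0", "0100", "1111", "1-11", "1011"]
partition, score = local_search(matrix, [0, 0, 1, 1, 1, 0])
print(partition, score)
```

On this toy matrix the starting partition has an MEC score of 6, and the search moves fragments 2 and 5 to reach a score of 2. Like any local search of this kind, it can stop at a local optimum that depends on the starting partition.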