Alignment of Multiple Proteins with an Ensemble of Hidden Markov Models

Sixth International Conference on Machine Learning and Applications (ICMLA 2007). Published: 2007-12-13. DOI: 10.1109/ICMLA.2007.90

Jia Song, Chunmei Liu, Yinglei Song, Junfeng Qu


Citations: 8

Abstract



The alignment of multiple protein sequences is a problem of fundamental importance in bioinformatics. In general, the optimal alignment can be obtained by optimizing an objective function. However, this optimization is often computationally intractable, so most existing alignment tools use statistical or machine-learning-based methods to avoid direct optimization. In this paper, we develop a new method that progressively constructs and updates a set of alignments by adding sequences, in a certain order, to each of the existing alignments. In particular, each existing alignment is modeled with a profile hidden Markov model (HMM), and each added sequence is aligned to every profile HMM in the set. The profile HMMs in the set are then updated based on the alignments with the leading alignment scores. We performed experiments on BAliBASE benchmarks to compare the performance of this new approach with that of other alignment tools. Our experiments showed that, by introducing an integer parameter that controls the number of profile HMMs in the set, we are able to explore the alignment space efficiently and significantly improve alignment accuracy on sequences with low similarity.
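The progressive, set-based procedure described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation, and it makes several simplifying assumptions that are not in the paper: each alignment is summarized by per-column residue frequencies instead of a trained profile HMM, the new sequence is aligned with Needleman-Wunsch dynamic programming rather than a Viterbi pass over an HMM, alternative candidates are generated by varying a gap penalty, and candidates are ranked by a simple sum-of-pairs score. The names `align_to_profile`, `sum_of_pairs`, `ensemble_align`, and `gap_penalties` are hypothetical; the parameter `k` plays the role of the integer that bounds the size of the set. Sequences are assumed to be non-empty.

```python
from itertools import combinations

GAP = "-"

def column_score(column, residue):
    """Score a residue against a column: +1 if every non-gap residue matches, -1 if none do."""
    residues = [c for c in column if c != GAP]
    if not residues:
        return 0.0
    return 2.0 * residues.count(residue) / len(residues) - 1.0

def align_to_profile(alignment, seq, gap_penalty):
    """Globally align seq to the columns of alignment and return the extended alignment."""
    cols = list(zip(*alignment))
    n, m = len(cols), len(seq)
    # dp[i][j]: best score for the first i columns against the first j residues.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] - gap_penalty
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] - gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + column_score(cols[i - 1], seq[j - 1]),
                dp[i - 1][j] - gap_penalty,
                dp[i][j - 1] - gap_penalty,
            )
    # Trace back from the bottom-right corner, collecting columns in reverse.
    out_cols = []
    blank = (GAP,) * len(alignment)
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + column_score(cols[i - 1], seq[j - 1]):
            out_cols.append(cols[i - 1] + (seq[j - 1],))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] - gap_penalty:
            out_cols.append(cols[i - 1] + (GAP,))  # gap in the new sequence
            i -= 1
        else:
            out_cols.append(blank + (seq[j - 1],))  # new column of gaps in the old rows
            j -= 1
    out_cols.reverse()
    return ["".join(row) for row in zip(*out_cols)]

def sum_of_pairs(alignment):
    """Common ranking score: +1 per matching pair, -1 per mismatch or residue-gap pair."""
    score = 0
    for col in zip(*alignment):
        for a, b in combinations(col, 2):
            if a == GAP and b == GAP:
                continue
            score += 1 if a == b else -1
    return score

def ensemble_align(sequences, k=3, gap_penalties=(0.5, 1.0, 2.0)):
    """Keep at most k alignments; extend each with the next sequence and keep the best k."""
    beam = [[sequences[0]]]
    for seq in sequences[1:]:
        candidates = {
            tuple(align_to_profile(aln, seq, g)) for aln in beam for g in gap_penalties
        }
        # Sort by descending score; break ties on the rows so the result is deterministic.
        ranked = sorted(candidates, key=lambda a: (-sum_of_pairs(a), a))
        beam = [list(a) for a in ranked[:k]]
    return [(sum_of_pairs(a), a) for a in beam]

if __name__ == "__main__":
    for score, aln in ensemble_align(["ACDEF", "ACEF", "ADEF", "ACDF"], k=2):
        print(score, *aln, sep="\n")
```

With `k=1` the sketch reduces to ordinary greedy progressive alignment. Larger values of `k` keep alternative alignments alive, so an early choice that looks slightly worse can still win after later sequences are added, which is the kind of wider search of the alignment space the abstract attributes to the integer parameter.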
