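A Generalized Hidden Markov Model Approach to Transmembrane Region Prediction with Poisson Distribution as State Duration Probabilities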

T. Kaburagi, Takashi Matsumoto
IPSJ Digital Courier, published 2008-03-15. DOI: 10.2197/IPSJDC.4.193 (https://doi.org/10.2197/IPSJDC.4.193)
Citation count: 2

Abstract

We present a novel algorithm for predicting transmembrane regions from a primary amino acid sequence. Previous studies have shown that the Hidden Markov Model (HMM) is a powerful tool for predicting transmembrane regions. A conceptual drawback of the standard HMM, however, is that the state duration, i.e., the length of time for which the hidden dynamics remain in a particular state, follows a geometric distribution, and real data do not always exhibit such a distribution. The proposed algorithm uses a Generalized Hidden Markov Model (GHMM), an extension of the HMM, to address this problem. In the GHMM, the state duration probability can be any discrete distribution, including the geometric distribution. The proposed algorithm employs a state duration probability based on a Poisson distribution. Instead of the 20-letter symbol sequence, we consider a two-dimensional vector trajectory consisting of the hydropathy index and the charge associated with each amino acid. A Monte Carlo method (the Forward/Backward Sampling method) is adopted for the transmembrane region prediction step. Prediction accuracies on publicly available data sets show that the proposed algorithm yields reasonably good results compared with some existing algorithms.
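
The contrast between the geometric state duration implied by a standard HMM and a Poisson-based duration can be illustrated with a short Python sketch. Everything in it is an illustrative assumption rather than the authors' formulation: the shifted Poisson form (duration minus one is Poisson distributed), the function names, the mean duration of 21 residues, the use of the Kyte-Doolittle scale as the hydropathy index, and the simple charge assignment (+1 for K and R, -1 for D and E, 0 otherwise).

```python
import math

def geometric_duration_pmf(d: int, stay_prob: float) -> float:
    """P(duration = d) for a standard HMM state with self-transition
    probability stay_prob: stay d - 1 times, then leave once."""
    if not 0.0 <= stay_prob < 1.0:
        raise ValueError("stay_prob must lie in [0, 1)")
    if d < 1:
        return 0.0
    return stay_prob ** (d - 1) * (1.0 - stay_prob)

def shifted_poisson_duration_pmf(d: int, rate: float) -> float:
    """P(duration = d) when d - 1 follows Poisson(rate).
    ASSUMPTION: the shift by one is our choice so that durations start
    at 1; the abstract only says 'based on a Poisson distribution'."""
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    if d < 1:
        return 0.0
    k = d - 1
    # Evaluate in log space to avoid overflow in rate**k and k!.
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))

# ASSUMPTION: Kyte-Doolittle hydropathy values stand in for the
# hydropathy index; the abstract does not name the scale used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# ASSUMPTION: a simplified charge assignment; all other residues are 0.
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}

def encode_sequence(sequence: str) -> list[tuple[float, float]]:
    """Map each residue to a (hydropathy, charge) pair.
    Raises KeyError for letters outside the 20 standard amino acids."""
    return [(KYTE_DOOLITTLE[aa], CHARGE.get(aa, 0.0))
            for aa in sequence.upper()]

if __name__ == "__main__":
    mean_duration = 21                      # hypothetical segment length
    stay_prob = 1.0 - 1.0 / mean_duration   # geometric mean = 1 / (1 - p)
    rate = mean_duration - 1.0              # shifted Poisson mean = rate + 1
    for d in (1, 10, 21, 40):
        print(d,
              geometric_duration_pmf(d, stay_prob),
              shifted_poisson_duration_pmf(d, rate))
    print(encode_sequence("LKD"))
```

With the means matched in this way, the geometric distribution places its largest probability on a duration of 1 and decays monotonically, whereas the shifted Poisson concentrates its mass near the mean. That difference is what makes an explicit duration distribution attractive for segments of roughly fixed length.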