Database search based on Bayesian alignment.

J Zhu, R Lüthy, C E Lawrence
Proceedings. International Conference on Intelligent Systems for Molecular Biology. Published 1999-01-01.

Abstract


Protein sequence databases grow larger each day. A common challenge is to predict the structures or functions of the sequences they contain. This is easy when a sequence shares direct similarity with a well-characterized protein. When there is no direct similarity, we must rely on a third sequence or a model as an intermediate to link the two proteins. We developed a new model-based method, called Bayesian search, to connect two distantly related proteins. We compared Bayesian search with pairwise and multiple sequence comparison methods on structural databases, using structural similarity as the criterion for relatedness. The results show that Bayesian search links more distantly related sequence pairs than the other methods, both collectively and consistently across large protein families. At an average of one error per query against the SCOP database PDB40D-B, Bayesian search found 36.5% of related pairs, PSI-BLAST found 32.6%, and the Smith-Waterman method found 25%. Examples are presented to show that the alignments predicted by Bayesian search agree well with structural alignments. False positives found by Bayesian search at low cutoff values are also analyzed.
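The "one error per query" figure above can be read as a coverage-at-fixed-error-rate measurement. The sketch below shows one plausible way to compute such a number and is not taken from the paper: it assumes that hits from all queries are pooled and ranked by a comparable score, that each hit carries a structure-derived label saying whether the pair is truly related, and that ties are left in input order. The function name `coverage_at_epq` and its parameters are hypothetical.

```python
from typing import Iterable, Tuple


def coverage_at_epq(
    hits: Iterable[Tuple[float, bool]],
    n_queries: int,
    n_related_pairs: int,
    errors_per_query: float = 1.0,
) -> float:
    """Fraction of truly related pairs recovered within an error budget.

    hits: (score, is_related) for every reported query/target pair,
          pooled over all queries; a higher score means a more confident hit.
    n_queries: number of query sequences searched.
    n_related_pairs: total number of related pairs according to the
          structural classification (the denominator of coverage).
    """
    # Total number of false positives allowed across all queries.
    max_errors = errors_per_query * n_queries

    # Rank hits from most to least confident.
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)

    true_pos = 0
    false_pos = 0
    for _score, is_related in ranked:
        if is_related:
            true_pos += 1
        else:
            # Stop before accepting a false positive that exceeds the budget.
            if false_pos + 1 > max_errors:
                break
            false_pos += 1

    return true_pos / n_related_pairs


# Toy data (invented for illustration only): 2 queries, 4 related pairs.
example_hits = [
    (9.0, True), (8.0, False), (7.0, True), (6.0, False),
    (5.0, True), (4.0, False), (3.0, True),
]
print(coverage_at_epq(example_hits, n_queries=2, n_related_pairs=4))
```

Under this reading, a method with higher coverage at the same error budget links more related pairs without admitting more false positives, which is the sense in which the three methods are compared in the abstract.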
