Search for Protein Sequence Homologues that Display Considerable Domain Length Variations

Eshita Mutt, A. Mitra, R. Sowdhamini
Journal: Int. J. Knowl. Discov. Bioinform.
Publication date: 2011-04-01
Publication type: Journal Article
DOI: 10.4018/jkdb.2011040104 (https://doi.org/10.4018/jkdb.2011040104)
Citations: 8

Abstract

Independent folding units capable of carrying out biological functions are classified as "protein domains". Under evolutionary pressure, these minimal structural units show not only considerable sequence changes among domains of similar fold and function, but also remarkable length variations. Rapid, heuristic sequence search algorithms are generally sensitive and effective at recognizing distantly related protein domains within large sequence databases, but are not well suited to identifying remote homologues of varying lengths. A further challenge is distinguishing reliable hits from the vast number of putative false positives that may have suboptimal sequence similarity. Here, the authors present a data-mining approach that applies stage-specific filters in sequence searches to reliably accumulate remote homologues, encouraging the sampling of length variations while keeping the false positive rate low. Identifying such remote homologues with pronounced length variations could contribute to a better understanding of functional variety within protein domain superfamilies.