PSST... the probabilistic sequence search tool

Crispin J. Miller, T. Attwood

Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001). Publication date: 2001-03-04.
DOI: 10.1109/BIBE.2001.974409 (https://doi.org/10.1109/BIBE.2001.974409)

Whole genome comparison and clustering cannot be routinely performed without access to significant resources. If, as expected, repositories continue to grow at the current rate, increasingly large and expensive systems will be required to maintain the status quo. The high proportion of uncharacterised gene sequences, combined with the fact that the majority of sequence analysis techniques are alignment-based, raises the possibility that alternative approaches might identify relationships that have otherwise been missed. There is therefore a need for alternative ways to predict function. PSST is an analysis tool with parallels to both pairwise algorithms and multiple motif-based pattern approaches. It is significantly faster than BLAST and, for some families, including GPCRs, it is also more sensitive and selective; for others it performs worse. This paper describes the algorithm, its implementation, and its evaluation against a diverse set of protein families, and discusses the reasons behind its behaviour.