Using protein-domain information for multiple sequence alignment
Layal Al Ait, Eduardo Corel, B. Morgenstern
2012 IEEE 12th International Conference on Bioinformatics & Bioengineering (BIBE), published 2012-11-11
DOI: https://doi.org/10.1109/BIBE.2012.6399667
Most approaches to multiple sequence alignment rely on primary-sequence information alone. External sources of information, however, can give valuable hints about sequence homologies that are not obvious from sequence comparison. Given the huge amount of sequence annotation produced on a daily basis, integrating such external information into the alignment process can help to produce biologically more meaningful alignments. In this paper, we investigate different approaches to using existing information about protein domains to improve multiple alignments. We use the PFAM database to identify possible domains in protein sequences, and we use this information to align the sequences with DIALIGN and with a recently developed graph-theoretical approach to multiple alignment. Test runs on BAliBASE and SABmark show that this approach leads to improved alignments.
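To illustrate the general idea of turning domain annotation into alignment constraints, the following is a minimal sketch and not the procedure used in the paper. It assumes that domain hits are already available as (sequence, domain, start, end) records; the names `DomainHit`, `Anchor`, and `domain_anchors`, as well as the placeholder domain labels, are hypothetical. Whenever two different sequences carry the same domain, the sketch proposes a gap-free anchor that pairs the two domain start positions over the shorter of the two hits. This is a deliberately crude simplification; a real pipeline would more plausibly derive residue correspondences from the domain model itself.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class DomainHit(NamedTuple):
    """One annotated domain occurrence (hypothetical record layout)."""
    seq_id: str
    domain: str  # domain label, e.g. a database accession
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


class Anchor(NamedTuple):
    """A gap-free segment pair suggested as an alignment constraint."""
    seq_a: str
    start_a: int
    seq_b: str
    start_b: int
    length: int
    domain: str


def domain_anchors(hits):
    """Propose one anchor per pair of hits of the same domain in different sequences."""
    by_domain = defaultdict(list)
    for hit in hits:
        by_domain[hit.domain].append(hit)

    anchors = []
    for domain, group in sorted(by_domain.items()):
        for a, b in combinations(group, 2):
            if a.seq_id == b.seq_id:
                continue  # repeats within one sequence are not paired here
            # Assumption: pair the hits start-to-start over the shorter hit.
            length = min(a.end - a.start + 1, b.end - b.start + 1)
            anchors.append(Anchor(a.seq_id, a.start, b.seq_id, b.start, length, domain))
    return anchors


if __name__ == "__main__":
    example_hits = [
        DomainHit("s1", "domA", 10, 50),
        DomainHit("s2", "domA", 5, 40),
        DomainHit("s3", "domB", 1, 20),
    ]
    for anchor in domain_anchors(example_hits):
        print(anchor)
```

Anchors of this kind could then be passed to an alignment program that accepts user-defined constraints, or used to weight edges in a graph-based formulation; how such constraints are scored and made mutually consistent is outside the scope of this sketch.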