Using protein-domain information for multiple sequence alignment

2012 IEEE 12th International Conference on Bioinformatics & Bioengineering (BIBE). Pub Date: 2012-11-11. DOI: 10.1109/BIBE.2012.6399667

Layal Al Ait, Eduardo Corel, B. Morgenstern

Citations: 4

Abstract

Most approaches to multiple sequence alignment rely on primary-sequence information. External sources of information, however, can give valuable hints about possible sequence homologies that may not be obvious from sequence comparison alone. Given the huge amount of sequence annotation produced on a daily basis, integrating such external information into the alignment process can help produce biologically more meaningful alignments. In this paper, we investigate different approaches to using existing information about protein domains for improved multiple alignments. We use the PFAM database to identify possible domains in protein sequences, and we use this information to align protein sequences with DIALIGN and with a recently developed graph-theoretical approach to multiple alignment. Test runs on BAliBASE and SABmark show that this approach leads to improved alignments.
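As a rough illustration of the idea described in the abstract, the sketch below turns per-sequence domain annotations into candidate segment pairs that an aligner could treat as hints. It is not the authors' implementation: the `DomainHit` record, the `candidate_anchors` function, and the accession strings and coordinates are all hypothetical, and the abstract does not say how the domain information is actually passed to DIALIGN or to the graph-theoretical method.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class DomainHit(NamedTuple):
    """One domain annotation on one sequence (hypothetical record layout)."""
    seq_id: str   # sequence identifier
    domain: str   # domain accession (placeholder values below)
    start: int    # 1-based inclusive start of the hit
    end: int      # 1-based inclusive end of the hit


def candidate_anchors(hits):
    """Pair up hits of the same domain that occur in different sequences."""
    by_domain = defaultdict(list)
    for hit in hits:
        by_domain[hit.domain].append(hit)

    anchors = []
    for domain, group in by_domain.items():
        for a, b in combinations(group, 2):
            if a.seq_id != b.seq_id:  # skip repeats within one sequence
                anchors.append(
                    (domain,
                     (a.seq_id, a.start, a.end),
                     (b.seq_id, b.start, b.end))
                )
    return anchors


# Made-up annotations for three sequences.
hits = [
    DomainHit("seqA", "PF00001", 10, 120),
    DomainHit("seqB", "PF00001", 35, 150),
    DomainHit("seqB", "PF00002", 200, 260),
    DomainHit("seqC", "PF00002", 5, 70),
]

for anchor in candidate_anchors(hits):
    print(anchor)
```

Each resulting tuple names a shared domain and the two segments carrying it. A real pipeline would still need to decide how strongly to trust each pair and how to resolve pairs that contradict one another.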
