{"title":"最优序列比对的快速剪枝算法","authors":"Aaron Davidson","doi":"10.1109/BIBE.2001.974411","DOIUrl":null,"url":null,"abstract":"Sequence alignment is an important operation in computational biology. Both dynamic programming and A* heuristic search algorithms for optimal sequence alignment are discussed and evaluated Presented here are two new algorithms for optimal pairwise sequence alignment which outperform traditional methods on very large problem instances (hundreds of thousands of characters, for example). The technique combines the benefits of dynamic programming and A* heuristic search, with a minimal amount of additional overhead. The dynamic programming matrix is traversed along antidiagonals, bounding the computation to exclude portions of the matrix that cannot contain optimal paths. An admissible heuristic assists in pruning away unnecessary areas of the matrix, while preserving optimal solutions for any given scoring function. Since memory requirements are a major concern for large sequence alignment problems, it is shown how the standard algorithm (requiring quadratic space) can be reformulated as a divide and conquer algorithm (requiring only linear space, at the cost of some recomputuation).","PeriodicalId":405124,"journal":{"name":"Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001)","volume":"147 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":"{\"title\":\"A fast pruning algorithm for optimal sequence alignment\",\"authors\":\"Aaron Davidson\",\"doi\":\"10.1109/BIBE.2001.974411\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Sequence alignment is an important operation in computational biology. Both dynamic programming and A* heuristic search algorithms for optimal sequence alignment are discussed and evaluated Presented here are two new algorithms for optimal pairwise sequence alignment which outperform traditional methods on very large problem instances (hundreds of thousands of characters, for example). The technique combines the benefits of dynamic programming and A* heuristic search, with a minimal amount of additional overhead. The dynamic programming matrix is traversed along antidiagonals, bounding the computation to exclude portions of the matrix that cannot contain optimal paths. An admissible heuristic assists in pruning away unnecessary areas of the matrix, while preserving optimal solutions for any given scoring function. Since memory requirements are a major concern for large sequence alignment problems, it is shown how the standard algorithm (requiring quadratic space) can be reformulated as a divide and conquer algorithm (requiring only linear space, at the cost of some recomputuation).\",\"PeriodicalId\":405124,\"journal\":{\"name\":\"Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001)\",\"volume\":\"147 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2001-03-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"26\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/BIBE.2001.974411\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBE.2001.974411","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A fast pruning algorithm for optimal sequence alignment
Sequence alignment is an important operation in computational biology. Both dynamic programming and A* heuristic search algorithms for optimal sequence alignment are discussed and evaluated Presented here are two new algorithms for optimal pairwise sequence alignment which outperform traditional methods on very large problem instances (hundreds of thousands of characters, for example). The technique combines the benefits of dynamic programming and A* heuristic search, with a minimal amount of additional overhead. The dynamic programming matrix is traversed along antidiagonals, bounding the computation to exclude portions of the matrix that cannot contain optimal paths. An admissible heuristic assists in pruning away unnecessary areas of the matrix, while preserving optimal solutions for any given scoring function. Since memory requirements are a major concern for large sequence alignment problems, it is shown how the standard algorithm (requiring quadratic space) can be reformulated as a divide and conquer algorithm (requiring only linear space, at the cost of some recomputuation).