Parallelization of BLAST with MapReduce for Long Sequence Alignment
Authors: Xiaoliang Yang, Yulong Liu, C. Yuan, Yihua Huang
Published in: 2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming
Publication date: 2011-12-09
DOI: 10.1109/PAAP.2011.36 (https://doi.org/10.1109/PAAP.2011.36)
Citation count: 17
Abstract
Sequence alignment is of great importance in biological research. BLAST is a sequence alignment tool used extensively by researchers. However, the continuously increasing amount of sequence data to be processed poses many challenges for it. This paper presents a simple and effective approach to parallelizing BLAST using the MapReduce technique. MapReduce-BLAST performs very well and scales nearly linearly with database size and query length. This results from both the power of MapReduce and the inherently parallel nature of the BLAST algorithm. Sequence alignment algorithms based on techniques similar to BLAST's seed-and-extend approach are well suited to parallelization with MapReduce.
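To make concrete why a seed-and-extend algorithm fits the MapReduce model, the following is a minimal, illustrative sketch and not the implementation described in the paper. It assumes the database is split into independent chunks, that each map task finds exact k-mer seeds and extends them without gaps, and that a single reduce step merges and ranks the hits. All names (`build_seed_index`, `extend`, `map_chunk`, `reduce_hits`) and the seed length `K` are hypothetical, and the scoring is plain match length rather than BLAST's statistical scoring.

```python
from collections import defaultdict
from itertools import chain

K = 4  # seed (word) length; an illustrative value, not taken from the paper


def build_seed_index(query, k=K):
    """Map every k-mer of the query to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    return index


def extend(query, subject, q_pos, s_pos, k=K):
    """Ungapped extension of an exact seed match in both directions."""
    left = 0
    while (q_pos - left > 0 and s_pos - left > 0
           and query[q_pos - left - 1] == subject[s_pos - left - 1]):
        left += 1
    right = k
    while (q_pos + right < len(query) and s_pos + right < len(subject)
           and query[q_pos + right] == subject[s_pos + right]):
        right += 1
    # Hit is (query start, subject start, matched length).
    return q_pos - left, s_pos - left, left + right


def map_chunk(query, seed_index, chunk, k=K):
    """Map step: seed-and-extend against one database chunk.

    `chunk` is a list of (sequence_id, sequence) pairs. The result is a
    set of (sequence_id, hit) pairs; the set removes duplicate hits that
    arise when several seeds extend to the same match.
    """
    emitted = set()
    for seq_id, subject in chunk:
        for s_pos in range(len(subject) - k + 1):
            for q_pos in seed_index.get(subject[s_pos:s_pos + k], ()):
                emitted.add((seq_id, extend(query, subject, q_pos, s_pos, k)))
    return emitted


def reduce_hits(mapped, top_n=5):
    """Reduce step: merge hits from all chunks and keep the longest ones."""
    merged = set(chain.from_iterable(mapped))
    return sorted(merged, key=lambda kv: (-kv[1][2], kv[0], kv[1]))[:top_n]


if __name__ == "__main__":
    query = "ACGTACGTGG"
    # Two database chunks; in a real cluster each would go to its own mapper.
    database = [[("s1", "TTACGTACGTGGTT")], [("s2", "GGGGACGTCC")]]
    index = build_seed_index(query)
    mapped = [map_chunk(query, index, chunk) for chunk in database]
    for seq_id, hit in reduce_hits(mapped):
        print(seq_id, hit)
```

Because each call to `map_chunk` depends only on the query and its own chunk, the map tasks share no state and can run on separate workers, which is the property the abstract refers to as the inherently parallel nature of the algorithm. A long query could be partitioned in a similar way, although hits that span a partition boundary would then need extra handling that this sketch does not attempt.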