FMSA: FPGA-Accelerated ClustalW-Based Multiple Sequence Alignment through Pipelined Prefiltering

2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines; Pub Date: 2012-04-29; DOI: 10.1109/FCCM.2012.38

A. Mahram, M. Herbordt


Citations: 16

Abstract



Multiple Sequence Alignment (MSA) is perhaps second only to sequence alignment in overall importance in bioinformatics; it is critical, for example, in determining the structure and function of molecules from putative families of sequences. But while pairwise sequence alignment has been the subject of scores of FPGA acceleration studies, MSA has been the subject of only a few. The most important of these accelerate Clustal-W, the most commonly used MSA code, either by implementing the first of its three phases (over 90% of the run time) with Dynamic Programming (DP) methods, or by accelerating the third phase, which consumes most of the remaining time. We use a new approach: we apply prefiltering of the kind commonly used in BLAST to perform the initial all-pairs alignments. This results in a speedup of 80× to 190× over the CPU code (8 cores) and a speedup of 2.5× to 8× over DP/FPGA- and GPU-based methods. When combined with a recently published method for phase 3, and using the original software for phase 2, the end-to-end speedup is at least 50× over an 8-core implementation of the original code. The quality is comparable to the original according to a commonly used benchmark suite evaluated with respect to multiple distance metrics.
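
To make the idea of replacing all-pairs DP alignment with a prefilter more concrete, the following is a minimal software sketch, not the FPGA pipeline described in the paper. It assumes a simple k-mer (word) matching filter in the spirit of BLAST-style seeding and turns the fraction of shared words into a pairwise distance. The function names (kmer_set, prefilter_score, all_pairs_distances), the word length k=3, the scoring formula, and the example sequences are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Return the set of all length-k substrings (words) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def prefilter_score(a, b, k=3):
    """Fraction of shared k-mers relative to the smaller word set (0.0 to 1.0).

    Assumption: this cheap word-match score stands in for a full DP
    alignment score; the paper's actual filter may differ.
    """
    words_a, words_b = kmer_set(a, k), kmer_set(b, k)
    if not words_a or not words_b:
        return 0.0  # a sequence shorter than k has no words to match
    return len(words_a & words_b) / min(len(words_a), len(words_b))


def all_pairs_distances(seqs, k=3):
    """Map each index pair (i, j) with i < j to 1 - prefilter_score."""
    return {
        (i, j): 1.0 - prefilter_score(seqs[i], seqs[j], k)
        for i, j in combinations(range(len(seqs)), 2)
    }


# Hypothetical toy input: two near-identical sequences and one unrelated one.
seqs = ["MKTAYIAKQR", "MKTAYIAKQL", "GGGGGGGGGG"]
distances = all_pairs_distances(seqs)
```

In a Clustal-W-style flow, a distance table of this kind would feed the guide-tree construction in phase 2. The set-intersection work per pair grows with sequence length rather than with the product of the two lengths as in DP, which is the general reason a word-match filter is attractive for the all-pairs phase.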
