Sitao Huang, G. Manikandan, Anand Ramachandran, K. Rupnow, Wen-mei W. Hwu, Deming Chen
Acceleration of the Pair-HMM Algorithm for DNA Variant Calling
2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2016-05-01
DOI: https://doi.org/10.1145/3020078.3021749
Citations: 46
Abstract
In this project, we propose an SoC solution to accelerate the Pair-HMM forward algorithm, which is the key performance bottleneck in GATK's HaplotypeCaller tool for DNA variant calling. We develop two versions of the Pair-HMM accelerator: one built with high-level synthesis (HLS) and the other a ring-based manual RTL implementation. We compare the manual RTL design and the HLS design in terms of design flexibility and overall run-time. The HLS implementation achieves a speed-up of up to 19x, and the RTL implementation achieves a speed-up of up to 95x.
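
For readers unfamiliar with the kernel being accelerated, the following is a minimal software sketch of a Pair-HMM forward pass over a read and a candidate haplotype. It is not the authors' accelerator or GATK's code. The function name `pair_hmm_forward`, the constant parameters `base_error`, `gap_open`, and `gap_ext`, and the uniform start over haplotype positions are assumptions chosen for illustration; a production implementation would typically derive these probabilities from per-base quality scores.

```python
def pair_hmm_forward(read, hap, base_error=0.01, gap_open=0.001, gap_ext=0.1):
    """Likelihood of `read` given `hap` under a simplified three-state Pair-HMM.

    States: M (match/mismatch), I (insertion in the read), D (deletion from
    the read). All probabilities are illustrative constants (assumptions).
    """
    if not read or not hap:
        raise ValueError("read and hap must be non-empty")
    R, H = len(read), len(hap)

    # Transition probabilities (hypothetical constants).
    t_mm = 1.0 - 2.0 * gap_open      # M -> M
    t_mi = t_md = gap_open           # M -> I, M -> D
    t_ii = t_dd = gap_ext            # I -> I, D -> D
    t_im = t_dm = 1.0 - gap_ext      # I -> M, D -> M

    # (R+1) x (H+1) tables; row 0 and column 0 are boundary cells.
    M = [[0.0] * (H + 1) for _ in range(R + 1)]
    I = [[0.0] * (H + 1) for _ in range(R + 1)]
    D = [[0.0] * (H + 1) for _ in range(R + 1)]

    # Assumed initialization: the read may start at any haplotype position.
    for j in range(H + 1):
        D[0][j] = 1.0 / H

    for i in range(1, R + 1):
        for j in range(1, H + 1):
            # Emission: match with prob 1 - e, each mismatch with prob e / 3.
            if read[i - 1] == hap[j - 1]:
                emit = 1.0 - base_error
            else:
                emit = base_error / 3.0
            # Each cell depends only on its diagonal, upper, and left neighbors.
            M[i][j] = emit * (t_mm * M[i - 1][j - 1]
                              + t_im * I[i - 1][j - 1]
                              + t_dm * D[i - 1][j - 1])
            I[i][j] = t_mi * M[i - 1][j] + t_ii * I[i - 1][j]
            D[i][j] = t_md * M[i][j - 1] + t_dd * D[i][j - 1]

    # The read must be fully consumed; it may end at any haplotype position.
    return sum(M[R][j] + I[R][j] for j in range(1, H + 1))


if __name__ == "__main__":
    print(pair_hmm_forward("ACGT", "ACGT"))
    print(pair_hmm_forward("ACGT", "TTTT"))
```

Because each cell reads only its diagonal, upper, and left neighbors, all cells on the same anti-diagonal are independent of one another. This sketch does not model how the HLS or ring-based RTL designs exploit that property.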