Beatrice Branchini, Sofia Breschi, Alberto Zeni, M. Santambrogio
2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), published 2022-05-01. DOI: 10.1109/IPDPSW55747.2022.00032 (https://doi.org/10.1109/IPDPSW55747.2022.00032)
Fast Genome Analysis Leveraging Exact String Matching
Genome assembly is one of the most challenging tasks in bioinformatics, as it is the key to many applications. One of the fundamental tasks in genome assembly is exact sequence alignment. This process enables the identification of recurrent patterns and mutations inside the DNA, which can substantially support clinicians in providing a quicker diagnosis and producing individual-specific drugs. However, this procedure represents a bottleneck in genome analysis, as it is computationally intensive and time-consuming. In this scenario, the efficiency of the algorithm chosen to perform this operation also plays a crucial role in speeding up the analysis process. In this paper, we present a high-performance, energy-efficient FPGA implementation of the Knuth-Morris-Pratt (KMP) algorithm. Our multi-core architecture parallelizes the alignment of the sequences, significantly reducing the execution time while still maintaining high flexibility. Experimental results show that our implementation on a Xilinx Alveo U280 achieves up to $2.68\times$ speedup and up to $7.46\times$ improvement in energy efficiency against Bowtie2, a state-of-the-art application for sequence alignment, run on a 40-thread Intel Xeon processor. Finally, our design also outperforms the hardware-accelerated implementations of KMP present in the state of the art by up to $19.38\times$ and $15.63\times$ in terms of throughput and energy efficiency, respectively.
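For readers unfamiliar with KMP, the following is a minimal software sketch of exact matching with the Knuth-Morris-Pratt algorithm. It is not the FPGA design described in the abstract. The function names (`build_failure_table`, `kmp_search`, `chunked_search`) and the chunking scheme are assumptions made for illustration; `chunked_search` only hints at how a reference sequence could be divided among independent cores, by overlapping neighbouring chunks by one character less than the pattern length so that no match is lost at a boundary.

```python
from typing import List


def build_failure_table(pattern: str) -> List[int]:
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text: str, pattern: str) -> List[int]:
    """Return the start index of every exact (possibly overlapping)
    occurrence of pattern in text."""
    if not pattern:
        return []
    failure = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]  # reuse earlier comparisons, never rescan text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches


def chunked_search(text: str, pattern: str, num_chunks: int) -> List[int]:
    """Hypothetical partitioning: search overlapping chunks independently,
    as separate cores could, and merge the global match positions."""
    if not pattern or num_chunks < 1:
        return []
    size = max(1, -(-len(text) // num_chunks))  # ceiling division
    overlap = len(pattern) - 1  # lets boundary-spanning matches be found
    matches = []
    for start in range(0, len(text), size):
        chunk = text[start:start + size + overlap]
        matches.extend(start + pos for pos in kmp_search(chunk, pattern))
    return matches
```

Because each chunk extends only `len(pattern) - 1` characters past its own range, every match is reported by exactly one chunk, so the merged result needs no deduplication. In this sketch the chunks are processed sequentially; a parallel version would dispatch each chunk to its own worker.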