Investigating Memory Optimization of Hash-index for Next Generation Sequencing on Multi-core Architecture

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum. Pub Date: 2012-05-21. DOI: 10.1109/IPDPSW.2012.83

Wendi Wang, Wen Tang, Linchuan Li, Guangming Tan, Peiheng Zhang, Ninghui Sun


Citations: 10

Abstract

Next Generation Sequencing (NGS) is gaining interest as demand grows and sequencing costs fall. A prerequisite step in most NGS applications is mapping short sequences, called reads, to template reference sequences. The explosion of NGS data, with billions of reads generated each day, and the data-intensive computation involved pose great challenges to existing computing systems. In this paper, we take a hash-index-based algorithm (PerM) as an example to investigate optimization approaches for accelerating NGS read mapping on multi-core architectures. First, we propose a new parallel algorithm that reorders bucket accesses in the hash index among multiple threads so that data locality in the shared cache is improved. Second, to reduce the number of empty hash buckets, we propose a serialized hash index compression algorithm, which matches the sequential access pattern of our new parallel algorithm. The reduced hash index size also makes it possible to use longer hash keys, which alleviates hash conflicts and improves query performance. Our experiments on an 8-socket, 8-core Intel Xeon X7550 SMP with 128 GB of memory show that the new parallel algorithm reduces the last-level cache (LLC) miss ratio to 8%-15% of that of the original algorithm and improves overall performance by 4 to 11 times (6 times on average).
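The two ideas in the abstract can be illustrated with a minimal, single-threaded Python sketch. It is not PerM's code or the authors' implementation: the function names, the 2-bit key encoding, the use of only the first k bases of each read as the seed, and the three-array layout (`keys`, `offsets`, `positions`) are all assumptions made for illustration. The sketch stores only non-empty buckets in a serialized form and queries seeds in ascending key order so that accesses to the index move in one direction.

```python
from bisect import bisect_left
from collections import defaultdict

BASES = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(kmer):
    """Pack a k-mer into an integer hash key, 2 bits per base."""
    key = 0
    for base in kmer:
        key = (key << 2) | BASES[base]
    return key


def build_compressed_index(reference, k):
    """Build a serialized index that stores only non-empty buckets.

    Returns (keys, offsets, positions). keys is sorted, and the reference
    positions for keys[i] are positions[offsets[i]:offsets[i + 1]].
    (Hypothetical layout, chosen for illustration.)
    """
    buckets = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        buckets[encode(reference[pos:pos + k])].append(pos)
    keys = sorted(buckets)
    offsets, positions = [0], []
    for key in keys:
        positions.extend(buckets[key])
        offsets.append(len(positions))
    return keys, offsets, positions


def lookup(index, key):
    """Return the reference positions stored for key, or an empty list."""
    keys, offsets, positions = index
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return positions[offsets[i]:offsets[i + 1]]
    return []


def map_reads_bucket_ordered(index, reads, k):
    """Look up each read's seed, visiting buckets in ascending key order."""
    # Sorting by key reorders the queries, not the reads themselves.
    seeds = sorted((encode(read[:k]), rid) for rid, read in enumerate(reads))
    hits = {}
    for key, rid in seeds:
        hits[rid] = lookup(index, key)
    return hits
```

For example, `map_reads_bucket_ordered(build_compressed_index("ACGTACGTTT", 4), ["GTTTAA", "ACGTAC"], 4)` looks up the seeds `ACGT` and `GTTT` in key order. In a multi-threaded setting, one plausible extension (again an assumption, not the paper's stated design) would be to have threads work on the same key range at the same time so that they share the cached portion of the index.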



