Shaoliang Peng, Xiangke Liao, Canqun Yang, Yutong Lu, Jie Liu, Yingbo Cui, Heng Wang, Chengkun Wu, Bingqiang Wang
{"title":"基因组大数据分析软件在TH-2超级计算机上的扩展挑战","authors":"Shaoliang Peng, Xiangke Liao, Canqun Yang, Yutong Lu, Jie Liu, Yingbo Cui, Heng Wang, Chengkun Wu, Bingqiang Wang","doi":"10.1109/CCGrid.2015.46","DOIUrl":null,"url":null,"abstract":"Whole genome re-sequencing plays a crucial role in biomedical studies. The emergence of genomic big data calls for an enormous amount of computing power. However, current computational methods are inefficient in utilizing available computational resources. In this paper, we address this challenge by optimizing the utilization of the fastest supercomputer in the world - TH-2 supercomputer. TH-2 is featured by its neo-heterogeneous architecture, in which each compute node is equipped with 2 Intel Xeon CPUs and 3 Intel Xeon Phi coprocessors. The heterogeneity and the massive amount of data to be processed pose great challenges for the deployment of the genome analysis software pipeline on TH-2. Runtime profiling shows that SOAP3-dp and SOAPsnp are the most time-consuming components (up to 70% of total runtime) in a typical genome-analyzing pipeline. To optimize the whole pipeline, we first devise a number of parallel and optimization strategies for SOAP3-dp and SOAPsnp, respectively targeting each node to fully utilize all sorts of hardware resources provided both by CPU and MIC. We also employ a few scaling methods to reduce communication between different nodes. We then scaled up our method on TH-2. With 8192 nodes, the whole analyzing procedure took 8.37 hours to finish the analysis of a 300 TB dataset of whole genome sequences from 2,000 human beings, which can take as long as 8 months on a commodity server. The speedup is about 700x.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"8 1","pages":"823-828"},"PeriodicalIF":0.0000,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Challenge of Scaling Genome Big Data Analysis Software on TH-2 Supercomputer\",\"authors\":\"Shaoliang Peng, Xiangke Liao, Canqun Yang, Yutong Lu, Jie Liu, Yingbo Cui, Heng Wang, Chengkun Wu, Bingqiang Wang\",\"doi\":\"10.1109/CCGrid.2015.46\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Whole genome re-sequencing plays a crucial role in biomedical studies. The emergence of genomic big data calls for an enormous amount of computing power. However, current computational methods are inefficient in utilizing available computational resources. In this paper, we address this challenge by optimizing the utilization of the fastest supercomputer in the world - TH-2 supercomputer. TH-2 is featured by its neo-heterogeneous architecture, in which each compute node is equipped with 2 Intel Xeon CPUs and 3 Intel Xeon Phi coprocessors. The heterogeneity and the massive amount of data to be processed pose great challenges for the deployment of the genome analysis software pipeline on TH-2. Runtime profiling shows that SOAP3-dp and SOAPsnp are the most time-consuming components (up to 70% of total runtime) in a typical genome-analyzing pipeline. To optimize the whole pipeline, we first devise a number of parallel and optimization strategies for SOAP3-dp and SOAPsnp, respectively targeting each node to fully utilize all sorts of hardware resources provided both by CPU and MIC. We also employ a few scaling methods to reduce communication between different nodes. We then scaled up our method on TH-2. With 8192 nodes, the whole analyzing procedure took 8.37 hours to finish the analysis of a 300 TB dataset of whole genome sequences from 2,000 human beings, which can take as long as 8 months on a commodity server. The speedup is about 700x.\",\"PeriodicalId\":6664,\"journal\":{\"name\":\"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing\",\"volume\":\"8 1\",\"pages\":\"823-828\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-05-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCGrid.2015.46\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGrid.2015.46","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The Challenge of Scaling Genome Big Data Analysis Software on TH-2 Supercomputer
Whole genome re-sequencing plays a crucial role in biomedical studies. The emergence of genomic big data calls for an enormous amount of computing power. However, current computational methods are inefficient in utilizing available computational resources. In this paper, we address this challenge by optimizing the utilization of the fastest supercomputer in the world - TH-2 supercomputer. TH-2 is featured by its neo-heterogeneous architecture, in which each compute node is equipped with 2 Intel Xeon CPUs and 3 Intel Xeon Phi coprocessors. The heterogeneity and the massive amount of data to be processed pose great challenges for the deployment of the genome analysis software pipeline on TH-2. Runtime profiling shows that SOAP3-dp and SOAPsnp are the most time-consuming components (up to 70% of total runtime) in a typical genome-analyzing pipeline. To optimize the whole pipeline, we first devise a number of parallel and optimization strategies for SOAP3-dp and SOAPsnp, respectively targeting each node to fully utilize all sorts of hardware resources provided both by CPU and MIC. We also employ a few scaling methods to reduce communication between different nodes. We then scaled up our method on TH-2. With 8192 nodes, the whole analyzing procedure took 8.37 hours to finish the analysis of a 300 TB dataset of whole genome sequences from 2,000 human beings, which can take as long as 8 months on a commodity server. The speedup is about 700x.