Ricardo Nobre, A. Ilic, Sergio Santander-Jiménez, Leonel Sousa
Proceedings of the 51st International Conference on Parallel Processing. Published 2022-08-29.
DOI: 10.1145/3545008.3545066 (https://doi.org/10.1145/3545008.3545066)
Tensor-Accelerated Fourth-Order Epistasis Detection on GPUs
The improved accessibility of gene sequencing technologies has led to the creation of huge datasets, i.e., patient records related to certain human diseases (phenotypes). Deriving fast and accurate algorithms for efficiently processing these datasets is therefore a paramount concern for enabling key healthcare scenarios, such as personalizing treatments, explaining the occurrence of and/or susceptibility to complex conditions, and reducing the spread of infectious diseases. This is especially true for high-order epistasis detection, one of the most computationally challenging problems in bioinformatics, where associations between a given phenotype and single nucleotide polymorphisms (SNPs) of a population can often only be uncovered by evaluating a large number of SNP combinations. To tackle this challenge, we propose a novel fourth-order epistasis detection algorithm that leverages the tensor processing capabilities of two distinct accelerator architectures by efficiently mapping the core computations involved in processing quads of SNPs to binary tensor-accelerated matrix operations. Experimental results show that the proposed approach delivers very high performance even in single-GPU environments: 27.8 and 90.9 tera quads of SNPs per second, scaled to the sample size, were processed on Titan RTX (Turing) and A100 (Ampere) PCIe GPUs, respectively. As the first approach to exploit tensor cores for accelerating searches with interaction order above three, the proposed method achieved up to 835.4 tera quads of SNPs per second on the 8-GPU HGX A100 server, two or more orders of magnitude higher than the related art.
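The abstract does not spell out how quads of SNPs are mapped to binary matrix operations, so the following is only a minimal CPU-side sketch of one way the counting step could be phrased in that form. It is not the authors' implementation. It assumes genotypes are encoded as 0, 1 or 2, that each SNP is split into three per-genotype bit vectors over the samples, and that phenotype handling (for example, building one table per phenotype class) is left out. All function names are hypothetical. Each cell of the 81-cell genotype table for a quad is then a dot product of two binary vectors, computed as a bitwise AND followed by a population count, and the result is checked against a direct per-sample count.

```python
from itertools import combinations
from math import comb
import random

GENOTYPES = (0, 1, 2)  # assumed encoding, not taken from the paper


def bit_planes(genotypes):
    """Per SNP, build one integer bitmask per genotype value.

    Bit s of planes[i][g] is set when sample s has genotype g at SNP i.
    """
    planes = []
    for snp in genotypes:
        masks = [0, 0, 0]
        for sample, g in enumerate(snp):
            masks[g] |= 1 << sample
        planes.append(masks)
    return planes


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def pair_rows(planes, i, j):
    """Nine binary rows, one per genotype pair (gi, gj) of SNPs i and j."""
    return [planes[i][gi] & planes[j][gj] for gi in GENOTYPES for gj in GENOTYPES]


def quad_table(planes, quad):
    """9x9 table of sample counts for one quad of SNPs.

    Rows index the genotype pair of the first two SNPs, columns that of the
    last two. Each cell is a binary dot product: popcount(row AND column).
    """
    a, b, c, d = quad
    left = pair_rows(planes, a, b)
    right = pair_rows(planes, c, d)
    return [[popcount(l & r) for r in right] for l in left]


def quad_table_bruteforce(genotypes, quad):
    """Reference: count samples one by one, without any bit tricks."""
    a, b, c, d = quad
    table = [[0] * 9 for _ in range(9)]
    for s in range(len(genotypes[0])):
        row = 3 * genotypes[a][s] + genotypes[b][s]
        col = 3 * genotypes[c][s] + genotypes[d][s]
        table[row][col] += 1
    return table


if __name__ == "__main__":
    random.seed(0)
    num_snps, num_samples = 6, 64  # toy sizes chosen for illustration
    genotypes = [[random.choice(GENOTYPES) for _ in range(num_samples)]
                 for _ in range(num_snps)]
    planes = bit_planes(genotypes)
    quads = list(combinations(range(num_snps), 4))
    assert len(quads) == comb(num_snps, 4)
    for quad in quads:
        assert quad_table(planes, quad) == quad_table_bruteforce(genotypes, quad)
    print(f"{len(quads)} quads checked; 1000 SNPs would need {comb(1000, 4):,} quads")
```

The number of quads grows as C(M, 4) in the number of SNPs M, which is why an exhaustive fourth-order search is so demanding. In this sketch the `popcount(l & r)` cells play the role that 1-bit matrix multiply-accumulate hardware would play on a GPU. How the authors tile, schedule and score these tables is not described in the abstract and is not modeled here.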