Toward Capturing Genetic Epistasis From Multivariate Genome-Wide Association Studies Using Mixed-Precision Kernel Ridge Regression

Hatem Ltaief, Rabab Alomairy, Qinglei Cao, Jie Ren, Lotfi Slim, Thorsten Kurth, Benedikt Dorschner, Salim Bougouffa, Rached Abdelkhalak, David E. Keyes

arXiv - QuanBio - Genomics, published 2024-09-03. DOI: arxiv-2409.01712 (https://doi.org/arxiv-2409.01712)
Abstract
We exploit the widening margin in tensor-core performance between
[FP64/FP32/FP16/INT8, FP64/FP32/FP16/FP8/INT8] on NVIDIA [Ampere, Hopper] GPUs to
boost the performance of output accuracy-preserving mixed-precision computation
of Genome-Wide Association Studies (GWAS) of 305K patients from the UK BioBank,
the largest-ever GWAS cohort studied for genetic epistasis using a multivariate
approach. Tile-centric adaptive-precision linear algebraic techniques motivated
by reducing data motion gain enhanced significance with low-precision GPU
arithmetic. At the core of Kernel Ridge Regression (KRR) techniques for GWAS
lie compute-bound cubic-complexity matrix operations that inhibit scaling to
aspirational dimensions of the population, genotypes, and phenotypes. We
accelerate KRR matrix generation by redesigning the computation for Euclidean
distances to engage INT8 tensor cores while exploiting symmetry. We accelerate
the solution of the regularized KRR systems by deploying a new four-precision
Cholesky-based solver, which, at 1.805 mixed-precision ExaOp/s on a nearly full
Alps system, outperforms the state-of-the-art CPU-only REGENIE GWAS software by
five orders of magnitude.
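
The two computational steps named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes genotypes coded as small integers (0, 1, 2), a Gaussian kernel, and a two-precision Cholesky solve with iterative refinement standing in for the four-precision tile-based solver. The function names `squared_distances_int8` and `krr_solve_mixed`, the kernel width, and the regularization value are all hypothetical choices made for illustration. The first function uses the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, which turns the distance computation into one symmetric integer matrix product, the kind of operation that INT8 tensor cores accelerate.

```python
import numpy as np


def squared_distances_int8(G):
    """Pairwise squared Euclidean distances between the rows of an int8 matrix.

    Expanding ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y moves almost all of the
    work into a single Gram-matrix product G @ G.T. Here the int8 inputs are
    widened to int32 to emulate INT8 multiplies with INT32 accumulation.
    """
    G32 = G.astype(np.int32)
    gram = G32 @ G32.T            # symmetric, so one triangle would suffice
    sq_norms = np.diag(gram)      # ||x_i||^2 sits on the diagonal
    return sq_norms[:, None] + sq_norms[None, :] - 2 * gram


def krr_solve_mixed(K, Y, lam, refine_steps=3):
    """Solve (K + lam * I) X = Y with a low-precision Cholesky factorization.

    Assumption for illustration: factorize in float32, then recover float64
    accuracy by iterative refinement. Y may hold several phenotypes as columns.
    """
    A = K + lam * np.eye(K.shape[0])                 # float64 system matrix
    L = np.linalg.cholesky(A.astype(np.float32))     # cubic-cost step, low precision

    def low_precision_solve(R):
        Z = np.linalg.solve(L, R.astype(np.float32))         # L Z = R
        return np.linalg.solve(L.T, Z).astype(np.float64)    # L^T X = Z

    X = low_precision_solve(Y)
    for _ in range(refine_steps):
        residual = Y - A @ X                         # residual in float64
        X += low_precision_solve(residual)
    return X


# Small synthetic usage example (random data, not UK BioBank data).
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(50, 200)).astype(np.int8)   # patients x SNPs
phenotypes = rng.standard_normal((50, 3))                        # patients x traits

D = squared_distances_int8(genotypes)
K = np.exp(-D / genotypes.shape[1])      # Gaussian kernel, hypothetical width
weights = krr_solve_mixed(K, phenotypes, lam=1.0)
```

Because the integer Gram product is exact as long as the int32 accumulator does not overflow, low precision in the distance step costs no accuracy in this sketch. The refinement loop shows one common way to keep output accuracy after a low-precision factorization, at the price of a few extra quadratic-cost solves.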