HiBGT: High-Performance Bayesian Group Testing for COVID-19
Weicong Chen, C. Tatsuoka, Xiaoyi Lu
2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC), published 2022-12-01
DOI: https://doi.org/10.1109/HiPC56025.2022.00033
Citations: 1
Abstract
The COVID-19 pandemic has necessitated disease surveillance using group testing. Novel Bayesian methods based on lattice models have been proposed; they offer substantial improvements in group testing efficiency by precisely quantifying uncertainty in diagnoses, accounting for varying individual risk and dilution effects, and guiding optimally convergent sequential pooled test selections. Computationally, however, Bayesian group testing poses considerable challenges because its computational complexity grows exponentially with sample size. HPC and big data stacks are needed to assess computational and statistical performance across fluctuating prevalence levels at large scales. Here, we study how to design and optimize the critical computational components of Bayesian group testing, including lattice model representation, test selection algorithms, and statistical analysis schemes, in the context of parallel computing. To this end, we propose HiBGT, a high-performance Bayesian group testing framework based on Apache Spark, which systematically explores the design space of Bayesian group testing and provides comprehensive heuristics for achieving high-performance, highly scalable Bayesian group testing. We show that HiBGT can perform large-scale test selections (> 250 state iterations) and accelerate statistical analyses by up to 15.9x (up to 363x with minor trade-offs) through a varied selection of sophisticated parallel computing techniques, while achieving near-linear scalability on up to 924 CPU cores.
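To make the exponential growth concrete, the following is a minimal single-machine sketch of a Bayesian update over a lattice of infection states. It is an illustration under stated assumptions, not the HiBGT implementation: the function names, the bitmask encoding of states, and the fixed `sensitivity` and `specificity` values are all hypothetical, and the dilution effects mentioned in the abstract are deliberately ignored. With N individuals the lattice holds 2^N states, which is why the full posterior quickly becomes expensive to store and update.

```python
def prior_over_states(risks):
    """Build the prior over all 2^N infection states.

    Each state is a bitmask: bit i set means individual i is positive.
    Individuals are assumed independent with prior risk risks[i].
    """
    n = len(risks)
    prior = {}
    for state in range(1 << n):
        p = 1.0
        for i, r in enumerate(risks):
            p *= r if (state >> i) & 1 else 1.0 - r
        prior[state] = p
    return prior


def update(posterior, pool, positive, sensitivity=0.95, specificity=0.99):
    """Apply Bayes' rule for one pooled test outcome.

    `pool` is a bitmask of the individuals in the pooled sample.
    The constant sensitivity/specificity are assumed values; no
    dilution model is applied in this sketch.
    """
    updated = {}
    for state, p in posterior.items():
        has_positive = (state & pool) != 0
        p_pos = sensitivity if has_positive else 1.0 - specificity
        updated[state] = p * (p_pos if positive else 1.0 - p_pos)
    total = sum(updated.values())
    return {s: p / total for s, p in updated.items()}


def marginal_positive(posterior, i):
    """Posterior probability that individual i is positive."""
    return sum(p for s, p in posterior.items() if (s >> i) & 1)


if __name__ == "__main__":
    risks = [0.10, 0.20, 0.05]           # hypothetical individual risks
    post = prior_over_states(risks)      # 2^3 = 8 lattice states
    post = update(post, pool=0b011, positive=False)  # pool individuals 0 and 1
    for i in range(len(risks)):
        print(i, round(marginal_positive(post, i), 4))
```

In this sketch, a negative result on the pool of individuals 0 and 1 lowers their marginal probabilities, while individual 2, who was not in the pool, keeps the prior risk. Every update touches all 2^N states, which suggests why a distributed engine such as Apache Spark becomes attractive as N grows.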