Advanced Bayesian kernel machine regression for large-scale exposome studies: Making the impossible possible.
Yi Guo, Huixun Jia, Ziwei Peng, Xinming Xu, Zhicheng Zhang, Keyu Pan, Yuqin Zhou, Haidong Kan, Zhenyu Wu, Cong Liu
The Innovation, volume 7, issue 4, article 101248. Published 2026-01-03. Journal Article.
DOI: https://doi.org/10.1016/j.xinn.2025.101248
Impact factor: 25.7 (JCR Q1, Multidisciplinary Sciences)
Citations: 0
Abstract
Exposome studies analyze numerous exposures with complex interactions and potential collinearity, which challenges conventional statistical methods. While Bayesian kernel machine regression (BKMR) has emerged as a promising solution, its widespread adoption has been hindered by high computational costs and limited interpretability. To address these limitations in large-scale exposome studies, we developed an advanced BKMR (A-BKMR) model. The Gaussian predictive process and matrix decomposition were used to reduce both processing time and memory requirements. Additionally, we employed the parametric g-formula to generate interpretable statistics, including joint and univariate effects as well as bivariate and multivariate interactions. Across scenarios with different sample sizes and numbers of exposures, A-BKMR demonstrated both high computational efficiency and strong model performance. Previously, analyzing datasets with sample sizes of 100,000 was infeasible for traditional BKMR. The current A-BKMR can complete such analyses in 1 h on a personal computer, making it over 700,000 times faster than conventional BKMR implementations. A-BKMR also accurately identifies important exposures, maintaining an area under the curve (AUC) > 0.99 and an R² > 0.97 across these scenarios. Furthermore, A-BKMR introduces novel quantitative metrics for effect estimates and interaction analyses, substantially enhancing interpretability. These advancements establish A-BKMR as an excellent statistical framework for future large-scale exposome studies.
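The abstract names the Gaussian predictive process and matrix decomposition as the source of the time and memory savings but gives no implementation details. The following is a minimal sketch of the general idea, not the authors' code: the full n × n kernel matrix is replaced by a rank-m approximation built from m knots, so that only m × m systems need to be solved. The function names, the single kernel scale `r`, the `jitter` term, the choice of knots, and the use of a Cholesky factor with the push-through identity are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, r=1.0):
    """Gaussian kernel exp(-r * ||a - b||^2) between rows of A and rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-r * np.maximum(sq, 0.0))

def low_rank_factor(Z, knots, r=1.0, jitter=1e-8):
    """Return U (n x m) with U @ U.T = K_nm (K_mm + jitter*I)^{-1} K_mn.

    This is the predictive-process style approximation of the n x n kernel
    matrix; the jitter is an assumed numerical safeguard.
    """
    K_mm = gaussian_kernel(knots, knots, r) + jitter * np.eye(len(knots))
    K_nm = gaussian_kernel(Z, knots, r)
    L = np.linalg.cholesky(K_mm)            # K_mm = L @ L.T
    return np.linalg.solve(L, K_nm.T).T     # U = K_nm @ inv(L).T

def posterior_mean_h(U, y, ratio):
    """Posterior mean of h in y = h + eps under the low-rank kernel U @ U.T.

    ratio = sigma^2 / tau (noise variance over kernel variance).
    Uses U U^T (U U^T + c I)^{-1} = U (U^T U + c I)^{-1} U^T, so the only
    linear solve is m x m instead of n x n.
    """
    m = U.shape[1]
    A = U.T @ U + ratio * np.eye(m)
    return U @ np.linalg.solve(A, U.T @ y)
```

With n observations and m knots, storing `U` takes O(nm) memory instead of O(n²), and the solve costs O(nm² + m³) instead of O(n³). This is the kind of saving the abstract attributes to the approach, though the actual A-BKMR algorithm may differ in its details.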
About the journal:
The Innovation is an interdisciplinary journal that aims to promote scientific application. It publishes cutting-edge research and high-quality reviews in various scientific disciplines, including physics, chemistry, materials, nanotechnology, biology, translational medicine, geoscience, and engineering. The journal adheres to the peer review and publishing standards of Cell Press journals.
The Innovation is committed to serving scientists and the public. It aims to publish significant advances promptly and provides a transparent exchange platform. The journal also strives to efficiently promote the translation from scientific discovery to technological achievements and rapidly disseminate scientific findings worldwide.
The Innovation is indexed in Scopus, the Directory of Open Access Journals (DOAJ), Web of Science, the Emerging Sources Citation Index (ESCI), PubMed Central, Compendex (previously Ei index), INSPEC, and CABI A&I.