Frequentist Grouped Weighted Quantile Sum Regression for Correlated Chemical Mixtures
Daniel Rud, Md Mostafijur Rahman, Anny H Xiang, Rob McConnell, Fred Lurmann, Michael J Kleeman, Joel Schwartz, Zhanghua Chen, Sandy Eckel, Juan Pablo Lewinger
Statistics in Medicine, 44(7), e70078. Published 2025-03-30.
DOI: 10.1002/sim.70078 (https://doi.org/10.1002/sim.70078)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11987061/pdf/
Abstract
As individuals are exposed to a myriad of potentially harmful pollutants every day, it is important to determine which actors have the greatest influence on health outcomes. However, jointly modeling the associations of multiple pollutant exposures is often hindered by the presence of highly correlated chemicals originating from a common source. A popular approach to analyzing associations between a disease outcome and several highly correlated exposures is Weighted Quantile Sum Regression (WQSR) modeling. WQSR provides increased stability in estimating model parameters but requires data splitting to estimate individual and group effects of chemicals, which reduces the power of the approach. A recent Bayesian implementation of WQSR, Bayesian Grouped Weighted Quantile Sum Regression (BGWQSR), provides a model fitting procedure that avoids data splitting at the cost of high computational expense on large data. In this paper, we introduce a Frequentist Grouped Weighted Quantile Sum Regression (FGWQSR) model that can be fitted efficiently to large datasets without requiring data splitting. FGWQSR produces estimates of the joint effect of mixture groups and of individual chemicals, and likelihood-ratio-based tests that account for FGWQSR's non-standard asymptotics. We demonstrate that FGWQSR is well calibrated for type-I errors while outperforming both BGWQSR and Quantile Logistic Regression in terms of statistical power to detect the effects of mixture groups and individual chemicals. In addition, we show that FGWQSR is robust to model misspecification and can be fitted on large datasets in a fraction of the time required for BGWQSR. We apply FGWQSR to a dataset of 317 767 mother-child pairs with exposure profiles generated by chemical transport models to study the associations between several components found in particulate matter with an aerodynamic diameter smaller than 2.5 μm (PM2.5) and child Autism Spectrum Disorder (ASD) diagnosis before age 5. PM2.5 copper and PM2.5 crustal material are found to be statistically significantly associated with ASD diagnosis by five years of age.
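To make the weighted quantile sum idea in the abstract concrete, the following is a minimal sketch of how a grouped index is commonly built: each exposure is replaced by its quantile score, and each mixture group contributes one weighted sum whose weights are non-negative and sum to 1. It is not the authors' fitting procedure. The function names, the use of quartiles, the toy weights and effects, and the logistic link are all assumptions made for illustration; in practice the weights and group effects would be estimated from data.

```python
import numpy as np


def quantile_scores(x, n_quantiles=4):
    """Map each column of x to integer quantile scores 0..n_quantiles-1."""
    x = np.asarray(x, dtype=float)
    scores = np.empty(x.shape, dtype=int)
    # Interior cut points, e.g. the 25th, 50th and 75th percentiles for quartiles.
    probs = np.linspace(0.0, 1.0, n_quantiles + 1)[1:-1]
    for j in range(x.shape[1]):
        cuts = np.quantile(x[:, j], probs)
        scores[:, j] = np.searchsorted(cuts, x[:, j], side="right")
    return scores


def grouped_wqs_index(scores, groups, weights):
    """Return an (n_samples, n_groups) matrix of weighted quantile sums.

    groups:  list of column-index lists, one per mixture group (hypothetical layout).
    weights: list of weight vectors, one per group, each on the unit simplex.
    """
    index = np.empty((scores.shape[0], len(groups)))
    for g, (cols, w) in enumerate(zip(groups, weights)):
        w = np.asarray(w, dtype=float)
        if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
            raise ValueError("weights in each group must be non-negative and sum to 1")
        index[:, g] = scores[:, cols] @ w
    return index


def linear_predictor(index, intercept, group_effects):
    """Intercept plus one joint effect per mixture group."""
    return intercept + index @ np.asarray(group_effects, dtype=float)


# Toy data: 200 subjects, 5 simulated exposures split into two groups.
rng = np.random.default_rng(0)
exposures = rng.lognormal(size=(200, 5))
scores = quantile_scores(exposures)
groups = [[0, 1, 2], [3, 4]]
weights = [[0.6, 0.3, 0.1], [0.5, 0.5]]  # illustrative values, not estimates
index = grouped_wqs_index(scores, groups, weights)
eta = linear_predictor(index, intercept=-3.0, group_effects=[0.4, 0.0])
prob = 1.0 / (1.0 + np.exp(-eta))  # assumed logistic link for a binary outcome
```

In this construction the group effect plays the role of the joint mixture effect, while the weights within a group indicate each chemical's relative contribution to that effect.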
About the journal:
The journal aims to influence practice in medicine and its associated sciences through the publication of papers on statistical and other quantitative methods. Papers will explain new methods and demonstrate their application, preferably through a substantive, real, motivating example or a comprehensive evaluation based on an illustrative example. Alternatively, papers will report on case studies where creative use or technical generalizations of established methodology are directed towards a substantive application. Reviews of, and tutorials on, general topics relevant to the application of statistics to medicine will also be published. The main criteria for publication are appropriateness of the statistical methods to a particular medical problem and clarity of exposition. Papers with primarily mathematical content will be excluded. The journal aims to enhance communication between statisticians, clinicians and medical researchers.