Using Machine Learning to Improve Control for Confounding in the Dynamic Weighted Ordinary Least Squares Estimator of Optimal Adaptive Treatment Strategies

IF 1.8 · Zone 3 (Biology) · JCR Q4, Mathematical & Computational Biology

Biometrical Journal Pub Date : 2025-07-29 DOI:10.1002/bimj.70068

Kossi Clément Trenou, Miceline Mésidor, Aida Eslami, Hermann Nabi, Caroline Diorio, Denis Talbot

Biometrical Journal, volume 67, issue 4. Publication type: Journal Article. Full text: https://onlinelibrary.wiley.com/doi/10.1002/bimj.70068

Citations: 0

Abstract

Estimating optimal adaptive treatment strategies (ATSs) can be done in several ways, including dynamic weighted ordinary least squares (dWOLS). This approach is doubly robust as it requires modeling both the treatment and the response, but only one of those models needs to be correctly specified to obtain a consistent estimator. For estimating an average treatment effect, doubly robust methods have been shown to combine better with machine learning methods than alternatives. However, the use of machine learning within dWOLS has not yet been investigated. Using simulation studies, we evaluate and compare the performance of the dWOLS estimator when the treatment probability is estimated either using machine learning algorithms or a logistic regression model. We further investigate the use of an adaptive $m$-out-of-$n$ bootstrap method for producing inferences. SuperLearner performed at least as well as logistic regression in terms of bias and variance in scenarios with simple data-generating models and often had improved performance in more complex scenarios. Moreover, the $m$-out-of-$n$ bootstrap produced confidence intervals with nominal coverage probabilities for parameters that were estimated with low bias. We also apply our proposed approach to the data from a breast cancer registry in Québec, Canada, to estimate an optimal ATS to personalize the use of hormonal therapy in breast cancer patients. Our method is implemented in the R software and available on GitHub https://github.com/kosstre20/MachineLearningToControlConfoundingPersonalizedMedicine.git. We recommend routine use of machine learning to model treatment within dWOLS, at least as a sensitivity analysis for the point estimates.
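To make the double-robustness claim concrete, here is a minimal single-stage dWOLS sketch. It is not the authors' implementation (theirs is in R and uses SuperLearner); it is a NumPy illustration under stated assumptions: one decision point, a binary treatment, a single covariate, the commonly used dWOLS weight |A - pi(X)|, and a simulated data-generating model invented here. All function and variable names are hypothetical. The treatment-free part of the outcome model is deliberately misspecified (a quadratic term is omitted), while the treatment model is correct, so the blip parameters should still be recovered.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, a, n_iter=25):
    """Newton-Raphson fit of a logistic regression; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (a - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def dwols_single_stage(y, a, x, pi_hat):
    """Weighted least squares of y on (1, x, a, a*x) with weights |a - pi_hat|.

    Returns the blip coefficients (psi0, psi1), i.e. those on a and a*x.
    """
    w = np.abs(a - pi_hat)
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into ordinary least squares
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef[2:]

# Simulated data (assumed model, for illustration only).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
a = rng.binomial(1, expit(0.5 * x)).astype(float)
psi_true = np.array([1.0, -1.5])
# True treatment-free part contains 0.5 * x**2, which the fitted model omits.
y = 1.0 + x + 0.5 * x**2 + a * (psi_true[0] + psi_true[1] * x) + rng.normal(size=n)

# Treatment model: logistic regression here; any estimator of P(A = 1 | X),
# such as a machine learning ensemble, could supply pi_hat instead.
X_trt = np.column_stack([np.ones(n), x])
pi_hat = expit(X_trt @ fit_logistic(X_trt, a))

psi_hat = dwols_single_stage(y, a, x, pi_hat)
treat = psi_hat[0] + psi_hat[1] * x > 0  # estimated rule: treat when the blip is positive
```

In this sketch, swapping the logistic fit for a flexible learner only changes how `pi_hat` is produced; the weighted regression step is unchanged, which is the setting the abstract evaluates.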


Source journal

Biometrical Journal (Biology: Mathematical & Computational Biology)

CiteScore: 3.20

Self-citation rate: 5.90%

Articles published: 119

Review time: 6-12 weeks

About the journal: Biometrical Journal publishes papers on statistical methods and their applications in life sciences including medicine, environmental sciences and agriculture. Methodological developments should be motivated by an interesting and relevant problem from these areas. Ideally, the manuscript should include a description of the problem and a section detailing the application of the new methodology to the problem. Case studies, review articles and letters to the editors are also welcome. Papers containing only extensive mathematical theory are not suitable for publication in Biometrical Journal.