A new method for clustered survival data: Estimation of treatment effect heterogeneity and variable selection

IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Biometrical Journal Pub Date : 2023-12-10 DOI:10.1002/bimj.202200178

Liangyuan Hu


Citations: 0

Abstract



We recently developed a new method, the random-intercept accelerated failure time model with Bayesian additive regression trees (riAFT-BART), to draw causal inferences about the population treatment effect on patient survival from clustered and censored survival data while accounting for the multilevel data structure. The practical utility of this method goes beyond estimating the population average treatment effect. In this work, we show how riAFT-BART can be used to address two important statistical questions with clustered survival data: estimating treatment effect heterogeneity and variable selection. Leveraging likelihood-based machine learning, we describe how to draw posterior samples of the individual survival treatment effect from riAFT-BART model runs, and use these samples in an exploratory treatment effect heterogeneity analysis to identify subpopulations whose treatment effects may differ from the population average effect. The literature on variable selection methods for clustered and censored survival data is sparse, particularly for methods using flexible modeling techniques. We propose a permutation-based approach to variable selection that uses each predictor's variable inclusion proportion supplied by the riAFT-BART model. To address the missing data issue frequently encountered in health databases, we propose a strategy that combines bootstrap imputation with riAFT-BART for variable selection among incomplete clustered survival data. We conduct an extensive simulation study to examine the practical operating characteristics of our proposed methods, and provide empirical evidence that they perform better than several existing methods across a wide range of data scenarios. Finally, we demonstrate the methods via a case study that identifies predictors of in-hospital mortality among severe COVID-19 patients and estimates the heterogeneous treatment effects of three COVID-specific medications. The methods developed in this work are readily available in the R package riAFTBART.
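The abstract describes permutation-based variable selection only at a high level. As a rough illustration, the sketch below shows one common form of such a rule: a predictor is kept when its observed variable inclusion proportion exceeds the upper quantile of the proportions obtained after refitting on permuted outcomes. This is an assumption about the general technique, not the paper's exact procedure and not the riAFTBART API. The function name `select_by_permutation`, the `alpha` parameter, and the simulated inclusion proportions are all hypothetical, and the model-fitting step is replaced by made-up numbers.

```python
import numpy as np


def select_by_permutation(observed_vip, null_vips, alpha=0.05):
    """Hypothetical helper: pick predictors by a per-predictor permutation threshold.

    observed_vip: inclusion proportions from the model fit on the real outcome,
                  shape (n_predictors,).
    null_vips:    inclusion proportions from fits on permuted outcomes,
                  shape (n_permutations, n_predictors).
    Returns the indices of predictors whose observed proportion is strictly
    greater than the (1 - alpha) quantile of their own permutation null.
    """
    observed_vip = np.asarray(observed_vip, dtype=float)
    null_vips = np.asarray(null_vips, dtype=float)
    if null_vips.ndim != 2 or null_vips.shape[1] != observed_vip.shape[0]:
        raise ValueError("null_vips must have shape (n_permutations, n_predictors)")
    # One threshold per predictor, taken column-wise over the permutation fits.
    thresholds = np.quantile(null_vips, 1.0 - alpha, axis=0)
    return np.flatnonzero(observed_vip > thresholds)


# Toy illustration with simulated numbers standing in for real model output.
rng = np.random.default_rng(0)
toy_null = rng.dirichlet(np.ones(4), size=200)   # 200 fake permutation fits
toy_observed = np.array([0.55, 0.15, 0.15, 0.15])
print(select_by_permutation(toy_observed, toy_null, alpha=0.05))
```

In a real analysis, both inputs would come from fitting the model once on the observed data and repeatedly on data with the outcome permuted. The per-predictor threshold used here is one design choice among several; stricter rules that compare against the maximum null proportion across all predictors are also possible.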


Source journal

Biometrical Journal (Biology: Mathematical & Computational Biology)

CiteScore: 3.20

Self-citation rate: 5.90%

Articles published: 119

Review time: 6-12 weeks

About the journal: Biometrical Journal publishes papers on statistical methods and their applications in the life sciences, including medicine, environmental sciences and agriculture. Methodological developments should be motivated by an interesting and relevant problem from these areas. Ideally, the manuscript should include a description of the problem and a section detailing the application of the new methodology to the problem. Case studies, review articles and letters to the editors are also welcome. Papers containing only extensive mathematical theory are not suitable for publication in Biometrical Journal.