A Tree-based Model Averaging Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources.

Proceedings of Machine Learning Research. Publication date: 2022-07-01

Xiaoqing Tan, Chung-Chou H Chang, Ling Zhou, Lu Tang

{"title":"A Tree-based Model Averaging Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources.","authors":"Xiaoqing Tan, Chung-Chou H Chang, Ling Zhou, Lu Tang","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Accurately estimating personalized treatment effects within a study site (e.g., a hospital) has been challenging due to limited sample size. Furthermore, privacy considerations and lack of resources prevent a site from leveraging subject-level data from other sites. We propose a tree-based model averaging approach to improve the estimation accuracy of conditional average treatment effects (CATE) at a target site by leveraging models derived from other potentially heterogeneous sites, without them sharing subject-level data. To our best knowledge, there is no established model averaging approach for distributed data with a focus on improving the estimation of treatment effects. Specifically, under distributed data networks, our framework provides an interpretable tree-based ensemble of CATE estimators that joins models across study sites, while actively modeling the heterogeneity in data sources through site partitioning. 
The performance of this approach is demonstrated by a real-world study of the causal effects of oxygen therapy on hospital survival rate and backed up by comprehensive simulation results.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"162 ","pages":"21013-21036"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10711748/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of machine learning research","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Citations: 0

Abstract

Accurately estimating personalized treatment effects within a single study site (e.g., a hospital) is challenging because of limited sample size. Furthermore, privacy considerations and a lack of resources prevent a site from leveraging subject-level data from other sites. We propose a tree-based model averaging approach that improves the estimation accuracy of conditional average treatment effects (CATE) at a target site by leveraging models derived from other, potentially heterogeneous, sites, without requiring those sites to share subject-level data. To the best of our knowledge, there is no established model averaging approach for distributed data that focuses on improving the estimation of treatment effects. Specifically, under distributed data networks, our framework provides an interpretable tree-based ensemble of CATE estimators that combines models across study sites while actively modeling the heterogeneity in data sources through site partitioning. The performance of this approach is demonstrated in a real-world study of the causal effect of oxygen therapy on hospital survival rate and is supported by comprehensive simulation results.
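
As a reading aid, the estimand and a generic form of model averaging can be written out. The CATE definition is the standard one; the weighted-sum form and the symbols K (number of sites), \hat{\tau}_k (the CATE model fitted at site k), and \omega_k (its weight at the target site) are illustrative assumptions, not notation taken from the abstract, and the paper's tree-based ensemble may differ in detail.

```latex
% CATE: expected difference in potential outcomes Y(1), Y(0) given covariates X = x
\tau(x) = \mathbb{E}\bigl[\, Y(1) - Y(0) \mid X = x \,\bigr]

% Generic model averaging at the target site (assumed form, for illustration only)
\hat{\tau}_{\mathrm{avg}}(x) = \sum_{k=1}^{K} \omega_k(x)\, \hat{\tau}_k(x),
\qquad \omega_k(x) \ge 0, \qquad \sum_{k=1}^{K} \omega_k(x) = 1
```

Under this reading, only the fitted models \hat{\tau}_k leave their sites, not subject-level data. A tree-based version would plausibly hold the weights constant within each leaf of a partition over sites and covariates, which is one way to interpret "site partitioning" in the abstract.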
