Two-step pragmatic subgroup discovery for heterogeneous treatment effects analyses: perspectives toward enhanced interpretability
Toshiaki Komura, Falco J. Bargagli-Stoffi, Koichiro Shiba, Kosuke Inoue
European Journal of Epidemiology (impact factor 5.9; JCR Q1, Public, Environmental & Occupational Health). Published 2025-03-04. DOI: 10.1007/s10654-025-01215-y (https://doi.org/10.1007/s10654-025-01215-y)
Abstract
Effect heterogeneity analyses using causal machine learning algorithms have gained popularity in recent years. However, estimated individualized effects must be interpreted with caution, because insights from these data-driven approaches may be misaligned with the contextual needs of a human audience. A practical framework that integrates advanced machine learning methods with decision-making is therefore critically needed for effective implementation and scientific communication. We introduce a 2-step framework that identifies characteristics associated with substantial effect heterogeneity in a practically relevant format. The proposed framework applies distinct sets of covariates for (i) estimation of individualized effects and (ii) subgroup discovery, and it presents the heterogeneous subgroups as highly interpretable if-then rules. Referring to existing metrics of interpretability, we describe how each step leverages a theoretical advantage of machine learning models while keeping the framework interpretable and practically relevant. We applied the pragmatic subgroup discovery framework to the Look AHEAD (Action for Health in Diabetes) trial to obtain practically relevant and comprehensive insights into heterogeneity in the effect of an intense lifestyle intervention on cardiovascular mortality among individuals with diabetes. Our analysis found that (i) individuals with a history of cardiovascular disease and myocardial infarction benefited least from the intervention, while (ii) individuals with no history of cardiovascular disease and HbA1c < 7% benefited most. In summary, our practical framework for discovering heterogeneous effects could serve as a generic strategy to ensure both effective implementation and scientific communication when applying machine learning algorithms in epidemiological research.
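The two-step logic described in the abstract can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' implementation: the abstract does not name the estimator or the rule learner, so the sketch swaps in a saturated cell-mean T-learner for step (i) and an exhaustive search over short if-then rules for step (ii). The covariate names, the simulated effect sizes, and all function names are hypothetical, and the simulated pattern is built in by construction rather than taken from the Look AHEAD data.

```python
import random
from itertools import combinations, product

random.seed(0)  # fixed seed so the synthetic example is reproducible

# Hypothetical covariate sets: step 1 uses every covariate, step 2 only
# those a practitioner could readily act on.
FULL = ["prior_cvd", "prior_mi", "hba1c_lt7", "age_ge60"]
PRAGMATIC = ["prior_cvd", "prior_mi", "hba1c_lt7"]

def true_effect(x):
    """Made-up risk difference (negative = benefit), used for simulation only."""
    if not x["prior_cvd"] and x["hba1c_lt7"]:
        return -0.15
    if x["prior_cvd"] and x["prior_mi"]:
        return 0.08
    return -0.02

def simulate(n=40000):
    """Synthetic randomized trial with a binary outcome (1 = event)."""
    rows = []
    for _ in range(n):
        x = {name: random.random() < 0.5 for name in FULL}
        t = int(random.random() < 0.5)  # randomized treatment assignment
        y = int(random.random() < 0.30 + t * true_effect(x))
        rows.append({"x": x, "t": t, "y": y})
    return rows

def estimate_individual_effects(rows):
    """Step 1: within each cell of the FULL covariates, treated-minus-control event rate."""
    totals = {}  # (cell, arm) -> [sum of outcomes, count]
    for r in rows:
        key = (tuple(r["x"][c] for c in FULL), r["t"])
        s = totals.setdefault(key, [0, 0])
        s[0] += r["y"]
        s[1] += 1
    effects = []
    for r in rows:
        cell = tuple(r["x"][c] for c in FULL)
        treated, control = totals[(cell, 1)], totals[(cell, 0)]
        effects.append(treated[0] / treated[1] - control[0] / control[1])
    return effects

def discover_rules(rows, effects, max_conditions=2, min_size=500):
    """Step 2: average the step-1 effects within every short rule over PRAGMATIC covariates."""
    rules = []
    for k in range(1, max_conditions + 1):
        for names in combinations(PRAGMATIC, k):
            for values in product([True, False], repeat=k):
                members = [e for r, e in zip(rows, effects)
                           if all(r["x"][name] == v for name, v in zip(names, values))]
                if len(members) >= min_size:  # skip subgroups too small to trust
                    text = "IF " + " AND ".join(
                        f"{name} = {v}" for name, v in zip(names, values))
                    rules.append((sum(members) / len(members), len(members), text))
    return sorted(rules)  # most negative (largest benefit) first

rows = simulate()
effects = estimate_individual_effects(rows)
rules = discover_rules(rows, effects)
for avg, size, text in (rules[0], rules[-1]):
    print(f"{text}: mean estimated risk difference {avg:+.3f} (n={size})")
```

Keeping the two covariate sets separate is the point of the design: step 1 can use every available covariate to estimate effects as accurately as possible, while step 2 restricts the reported rules to characteristics that are meaningful to the intended audience. In practice, a flexible causal machine learning estimator and a rule-learning algorithm would presumably replace the two simple stand-ins used here.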
About the journal:
The European Journal of Epidemiology, established in 1985, is a peer-reviewed publication that provides a platform for discussions on epidemiology in its broadest sense. It covers various aspects of epidemiologic research and statistical methods. The journal facilitates communication between researchers, educators, and practitioners in epidemiology, including those in clinical and community medicine. Contributions from diverse fields such as public health, preventive medicine, clinical medicine, health economics, and computational biology and data science, in relation to health and disease, are encouraged. While accepting submissions from all over the world, the journal particularly emphasizes European topics relevant to epidemiology. The published articles consist of empirical research findings, developments in methodology, and opinion pieces.