Heterogeneity in Effects of Automated Results Feedback After Online Depression Screening: Secondary Machine-Learning Based Analysis of the DISCOVER Trial.
Matthias Klee, Byron C Jaeger, Franziska Sikorski, Bernd Löwe, Sebastian Kohlmann
{"title":"Heterogeneity in Effects of Automated Results Feedback After Online Depression Screening: Secondary Machine-Learning Based Analysis of the DISCOVER Trial.","authors":"Matthias Klee, Byron C Jaeger, Franziska Sikorski, Bernd Löwe, Sebastian Kohlmann","doi":"10.2196/70001","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Online depression screening tools may increase uptake of evidence-based care and consequently lead to symptom reduction. However, results of the DISCOVER trial suggested no effect of automated results feedback compared with no feedback after online depression screening on depressive symptom reduction six months after screening. Interpersonal variation in symptom representation, health care needs, and treatment preferences may nonetheless have led to differential response to feedback mode on an individual level.</p><p><strong>Objective: </strong>The aim of this study was to examine heterogeneity of treatment effects (HTE), that is, differential responses to two feedback modes (tailored or nontailored) versus no feedback (control) following online depression screening.</p><p><strong>Methods: </strong>We used causal forests, a machine learning method that applies recursive partitioning to estimate conditional average treatment effects (CATEs). In this secondary data analysis of the DISCOVER trial, eligible participants screened positive for at least moderate depression severity but had not been diagnosed or treated for depression in the preceding year. The primary outcome was heterogeneity in depression severity change, over a and six-month follow up period, measured with the Patient Health Questionnaire-9. Analysis comprised exploration of average treatment effects (ATE), HTE, operationalized with the area under the targeting operator characteristic curve (AUTOC), and differences in ATE when allocating feedback based on predicted CATE. 
We extracted top predictors of depression severity change, given feedback and explored high-CATE covariate profiles. Prior to analysis, data was split into training and test sets (1:1) to minimize the risk of overfitting and evaluate predictions in held-out test data.</p><p><strong>Results: </strong>Data from 946 participants of the DISCOVER trial without missing data were analyzed. We did not detect HTE; no versus nontailored feedback, AUTOC -0.48 (95% CI -1.62 to 0.67, P=.41); no versus tailored feedback, AUTOC 0.06 (95% CI -1.21 to 1.33, P=.93); and no versus any feedback, AUTOC -0.20 (95% CI -1.30 to 0.89, P=.72). There was no evidence of alteration to the ATE in the test set when allocating feedback (tailored or nontailored) based on the predicted CATE. By examining covariate profiles, we observed a potentially detrimental role of control beliefs, given feedback compared with no feedback.</p><p><strong>Conclusions: </strong>We applied causal forests to describe higher-level interactions among a broad range of predictors to detect HTE. In absence of evidence for HTE, treatment prioritization based on trained models did not improve ATEs. We did not find evidence of harm or benefit from providing tailored or nontailored feedback after online depression screening regarding depression severity change after six months. 
Future studies may test whether screening alone prompts behavioral activation and downstream depression severity reduction, considering the observed uniform changes across groups.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e70001"},"PeriodicalIF":2.0000,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12375799/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR AI","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2196/70001","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Background: Online depression screening tools may increase uptake of evidence-based care and consequently lead to symptom reduction. However, results of the DISCOVER trial suggested no effect of automated results feedback compared with no feedback after online depression screening on depressive symptom reduction six months after screening. Interpersonal variation in symptom representation, health care needs, and treatment preferences may nonetheless have led to differential response to feedback mode on an individual level.
Objective: The aim of this study was to examine heterogeneity of treatment effects (HTE), that is, differential responses to two feedback modes (tailored or nontailored) versus no feedback (control) following online depression screening.
Methods: We used causal forests, a machine learning method that applies recursive partitioning to estimate conditional average treatment effects (CATEs). In this secondary data analysis of the DISCOVER trial, eligible participants screened positive for at least moderate depression severity but had not been diagnosed or treated for depression in the preceding year. The primary outcome was heterogeneity in depression severity change over a six-month follow-up period, measured with the Patient Health Questionnaire-9. The analysis comprised exploration of average treatment effects (ATEs), of HTE operationalized with the area under the targeting operator characteristic curve (AUTOC), and of differences in ATE when allocating feedback based on predicted CATEs. We extracted top predictors of depression severity change given feedback and explored high-CATE covariate profiles. Prior to analysis, the data were split into training and test sets (1:1) to minimize the risk of overfitting and to evaluate predictions on held-out test data.
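The evaluation logic described here — fit a CATE model on a training half, then judge on a held-out half whether ranking participants by predicted CATE isolates a subgroup with a larger treatment effect (the targeting operator characteristic) — can be sketched in Python. This is not the authors' implementation (the trial used causal forests); a simple T-learner with random forests stands in for the CATE model, and the synthetic data, variable names, and the crude mean-of-TOC AUTOC estimate are illustrative assumptions only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for trial data: X covariates, t a binary feedback
# indicator, y the change in depression severity (hypothetical units).
n, p = 2000, 5
X = rng.normal(size=(n, p))
t = rng.integers(0, 2, size=n)
tau = 0.5 * X[:, 0]                  # simulated heterogeneous effect
y = X[:, 1] + tau * t + rng.normal(size=n)

# 1:1 train/test split, mirroring the held-out evaluation in the abstract.
X_tr, X_te, t_tr, t_te, y_tr, y_te = train_test_split(
    X, t, y, test_size=0.5, random_state=0
)

# T-learner: separate outcome model per arm; predicted CATE = difference.
m1 = RandomForestRegressor(random_state=0).fit(X_tr[t_tr == 1], y_tr[t_tr == 1])
m0 = RandomForestRegressor(random_state=0).fit(X_tr[t_tr == 0], y_tr[t_tr == 0])
cate_hat = m1.predict(X_te) - m0.predict(X_te)

def ate(y_arr, t_arr):
    """Difference in mean outcomes between treated and control."""
    return y_arr[t_arr == 1].mean() - y_arr[t_arr == 0].mean()

# Targeting operator characteristic (TOC): for each fraction q of units
# with the largest predicted CATE, the ATE in that top-q group minus the
# overall ATE. AUTOC is the area under this curve; here approximated
# crudely as the mean over a grid of fractions.
order = np.argsort(-cate_hat)        # rank by predicted benefit
overall = ate(y_te, t_te)
fractions = np.linspace(0.1, 1.0, 10)
toc = [
    ate(y_te[order[: int(q * len(order))]],
        t_te[order[: int(q * len(order))]]) - overall
    for q in fractions
]
autoc = float(np.mean(toc))
print(autoc)
```

Under this data-generating process the true effect varies with the first covariate, so a useful CATE model yields a positive AUTOC; an AUTOC near zero, as reported in the trial, indicates that targeting by predicted CATE does not beat treating everyone.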
Results: Data from 946 participants of the DISCOVER trial without missing data were analyzed. We did not detect HTE; no versus nontailored feedback, AUTOC -0.48 (95% CI -1.62 to 0.67, P=.41); no versus tailored feedback, AUTOC 0.06 (95% CI -1.21 to 1.33, P=.93); and no versus any feedback, AUTOC -0.20 (95% CI -1.30 to 0.89, P=.72). There was no evidence of alteration to the ATE in the test set when allocating feedback (tailored or nontailored) based on the predicted CATE. By examining covariate profiles, we observed a potentially detrimental role of control beliefs, given feedback compared with no feedback.
Conclusions: We applied causal forests to describe higher-level interactions among a broad range of predictors to detect HTE. In the absence of evidence for HTE, treatment prioritization based on trained models did not improve ATEs. We found no evidence of harm or benefit from providing tailored or nontailored feedback after online depression screening with respect to depression severity change after six months. Considering the observed uniform changes across groups, future studies may test whether screening alone prompts behavioral activation and a downstream reduction in depression severity.