{"title":"A comparison of techniques for predicting telehealth visit failure","authors":"Alexander J. Idarraga , David F. Schneider","doi":"10.1016/j.ibmed.2025.100235","DOIUrl":null,"url":null,"abstract":"<div><h3>Objective</h3><div>Telehealth is an increasingly important method for delivering care. Health systems lack the ability to accurately predict which telehealth visits will fail due to poor connection, poor technical literacy, or other reasons. This results in wasted resources and disrupted patient care. The purpose of this study is to characterize and compare various methods for predicting telehealth visit failure, and to determine the prediction method most suited for implementation in a real-time operational setting.</div></div><div><h3>Methods</h3><div>A single-center, retrospective cohort study was conducted using data sourced from our data warehouse. Patient demographic information and data characterizing prior visit success and engagement with electronic health tools were included. Three main model types were evaluated: an existing scoring model developed by Hughes et al., a regression-based scoring model, and Machine Learning classifiers. Variables were selected for their importance and anticipated availability; Number Needed to Treat was used to demonstrate the number of interventions (e.g. pre-visit phone calls) required to improve success rates in the context of weekly patient volumes.</div></div><div><h3>Results</h3><div>217, 229 visits spanning 480 days were evaluated, of which 22,443 (10.33 %) met criteria for failure. Hughes et al.’s model applied to our data yielded an Area Under the Receiver Operating Characteristics Curve (AUC ROC) of 0.678 when predicting failure. A score-based model achieved an AUC ROC of 0.698. Logistic Regression, Random Forest, and Gradient Boosting models demonstrated AUC ROCs ranging from 0.7877 to 0.7969. A NNT of 32 was achieved if the 263 highest-risk patients were selected in a low-volume week using the RF classifier, compared to an expected NNT of 90 if the same number of patients were randomly selected.</div></div><div><h3>Conclusions</h3><div>Machine Learning classifiers demonstrated superiority over score-based methods for predicting telehealth visit failure. Prospective evaluation is required; evaluation using NNT as a metric can help to operationalize these models.</div></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"11 ","pages":"Article 100235"},"PeriodicalIF":0.0000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligence-based medicine","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666521225000390","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Objective
Telehealth is an increasingly important method for delivering care. Health systems lack the ability to accurately predict which telehealth visits will fail due to poor connection, poor technical literacy, or other reasons. This results in wasted resources and disrupted patient care. The purpose of this study is to characterize and compare various methods for predicting telehealth visit failure, and to determine the prediction method most suited for implementation in a real-time operational setting.
Methods
A single-center, retrospective cohort study was conducted using data sourced from our data warehouse. Patient demographic information and data characterizing prior visit success and engagement with electronic health tools were included. Three main model types were evaluated: an existing scoring model developed by Hughes et al., a regression-based scoring model, and Machine Learning classifiers. Variables were selected for their importance and anticipated availability. The Number Needed to Treat (NNT) was used to estimate the number of interventions (e.g., pre-visit phone calls) required to improve success rates in the context of weekly patient volumes.
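A minimal sketch of this type of comparison, using scikit-learn classifiers evaluated by AUC ROC; the synthetic, imbalanced dataset below is a stand-in for the study cohort, and the actual feature set, the Hughes et al. score, and the regression-based scoring model are not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the cohort: roughly 10% positive (failed) class,
# mirroring the class imbalance reported in the Results.
X, y = make_classification(n_samples=20000, n_features=12, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC ROC = {auc:.4f}")
```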
Results
217,229 visits spanning 480 days were evaluated, of which 22,443 (10.33%) met criteria for failure. Hughes et al.'s model applied to our data yielded an Area Under the Receiver Operating Characteristic Curve (AUC ROC) of 0.678 when predicting failure. A score-based model achieved an AUC ROC of 0.698. Logistic Regression, Random Forest, and Gradient Boosting models demonstrated AUC ROCs ranging from 0.7877 to 0.7969. An NNT of 32 was achieved if the 263 highest-risk patients were selected in a low-volume week using the Random Forest classifier, compared to an expected NNT of 90 if the same number of patients were randomly selected.
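A simplified illustration of how an NNT can be derived from model risk scores, assuming a hypothetical intervention that prevents a fixed fraction of failures among the visits it reaches (`intervention_effect` below is an assumed parameter, not a figure from the study, and the paper's exact NNT calculation may differ); targeting the highest-risk visits yields a lower NNT than random selection:

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = (rng.random(20000) < 0.10).astype(int)                 # ~10% failure rate
risk = np.clip(y_true * 0.3 + rng.random(20000) * 0.7, 0, 1)    # imperfect risk scores

def nnt(failure_rate, intervention_effect=0.30):
    """NNT = 1 / absolute risk reduction in the group offered the intervention."""
    arr = failure_rate * intervention_effect
    return float("inf") if arr == 0 else 1.0 / arr

k = 263                                  # number of visits targeted in a week
top_k = np.argsort(risk)[::-1][:k]       # k highest-risk visits by model score
print("Targeted NNT:", round(nnt(y_true[top_k].mean()), 1))
print("Random-selection NNT:", round(nnt(y_true.mean()), 1))
```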
Conclusions
Machine Learning classifiers demonstrated superiority over score-based methods for predicting telehealth visit failure. Prospective evaluation is required; using NNT as a metric can help operationalize these models.