Machine Learning-Based Prediction of No-Show Telemedicine Encounters.

IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES
Telemedicine reports. Pub Date: 2025-04-07. eCollection Date: 2025-01-01. DOI: 10.1089/tmr.2025.0009
C Mahony Reategui-Rivera, Wanting Cui, Stefan Escobar-Agreda, Leonardo Rojas-Mezarina, Joseph Finkelstein
Telemedicine reports, volume 6, issue 1, pages 109-119. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12235123/pdf/
Citations: 0

Abstract

Aim: This study aimed to evaluate the performance of machine learning (ML) models in predicting patient no-shows for telemedicine appointments within the Peruvian health system and to identify key predictors of nonattendance.

Methods: We performed a retrospective observational study using anonymized data (June 2019 to November 2023) from "Teleatiendo." The dataset included over 1.5 million completed appointments and about 64,000 no-shows (4.1%), focusing on teleorientation and telemonitoring. Predictor variables included patient demographics, socioeconomic factors, health care facility characteristics, appointment timing, and telemedicine service types. A 70% training, 10% validation, and 20% testing split was used over 10 iterations, with hyperparameter tuning performed on the validation set to identify optimal model parameters. Multiple ML approaches (random forest, XGBoost, LightGBM, and anomaly detection) were implemented in combination with undersampling and cost-sensitive learning to address class imbalance. Performance was evaluated using precision, recall, specificity, area under the curve (AUC), F1-score, and accuracy.
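
The split and the two imbalance strategies named in the Methods can be illustrated with a minimal sketch. It uses only the Python standard library and synthetic labels drawn at roughly the reported 4.1% no-show rate. The function names, the random shuffling, and the 1:1 undersampling ratio are assumptions made for illustration; the abstract does not state how the authors drew the split or what sampling ratio they used.

```python
import random


def split_70_10_20(n_rows, seed):
    """Shuffle row indices and cut them into 70% train, 10% validation, 20% test."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_train = n_rows * 70 // 100
    n_val = n_rows * 10 // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def undersample(labels, indices, seed):
    """Keep every no-show (label 1) plus an equally sized random sample of attended rows (label 0).

    The 1:1 ratio is an assumption; other ratios are possible.
    """
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    kept_neg = random.Random(seed).sample(neg, min(len(pos), len(neg)))
    return pos + kept_neg


def positive_class_weight(labels, indices):
    """Cost-sensitive alternative: weight each no-show by the attended/no-show ratio."""
    n_pos = sum(labels[i] for i in indices)
    n_neg = len(indices) - n_pos
    return n_neg / n_pos


# Synthetic labels only; these are not the Teleatiendo data.
rng = random.Random(0)
labels = [1 if rng.random() < 0.041 else 0 for _ in range(10_000)]

train, val, test = split_70_10_20(len(labels), seed=1)
balanced_train = undersample(labels, train, seed=1)
weight = positive_class_weight(labels, train)

print(len(train), len(val), len(test))
print(len(balanced_train), round(weight, 1))
```

Undersampling discards most attended appointments from the training set, whereas the cost-sensitive route keeps all rows and penalizes missed no-shows more heavily. In XGBoost, a weight of this kind is commonly passed through the `scale_pos_weight` parameter, though whether the authors used that parameter is not stated in the abstract.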

Results: Of the models tested, undersampling with XGBoost achieved a precision of 0.115 (±0.001), recall of 0.654 (±0.005), specificity of 0.786 (±0.002), AUC of 0.720 (±0.002), and accuracy of 0.780 (±0.002). In contrast, cost-sensitive XGBoost exhibited a more balanced performance, with a precision of 0.123 (±0.001), recall of 0.639 (±0.006), specificity of 0.805 (±0.004), AUC of 0.722 (±0.001), and accuracy of 0.799 (±0.003). Additionally, cost-sensitive random forest achieved the highest specificity (0.843 ± 0.002) and accuracy (0.832 ± 0.001) but recorded a lower recall (0.585 ± 0.004), while cost-sensitive LightGBM and balanced random forest yielded performance metrics similar to cost-sensitive XGBoost. Isolation forest, used for anomaly detection, demonstrated the lowest performance.
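
The low precision values follow largely from the 4.1% no-show rate rather than from poor ranking alone. As a rough consistency check, precision and accuracy can be recomputed from the reported recall and specificity, assuming the test set has the same 4.1% prevalence as the full dataset (an assumption; the abstract does not give test-set counts). The function names below are illustrative.

```python
def precision_from_rates(recall, specificity, prevalence):
    """Precision implied by recall, specificity, and the positive-class prevalence."""
    true_pos = recall * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def accuracy_from_rates(recall, specificity, prevalence):
    """Accuracy is the prevalence-weighted mean of recall and specificity."""
    return recall * prevalence + specificity * (1.0 - prevalence)


PREVALENCE = 0.041  # assumed equal to the overall no-show rate in the abstract

# (recall, specificity) pairs as reported in the Results
reported = {
    "undersampling XGBoost": (0.654, 0.786),
    "cost-sensitive XGBoost": (0.639, 0.805),
}

for name, (recall, specificity) in reported.items():
    p = precision_from_rates(recall, specificity, PREVALENCE)
    a = accuracy_from_rates(recall, specificity, PREVALENCE)
    print(f"{name}: implied precision {p:.3f}, implied accuracy {a:.3f}")
```

Under that assumption the implied values land within about 0.001 of the reported precision and accuracy. Even a specificity near 0.8 yields many false alarms when roughly 96% of appointments are attended, which is why precision stays near 0.12.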

Conclusions: ML models can moderately predict telemedicine no-shows in Peru, with cost-sensitive boosting techniques enhancing the identification of high-risk patients. Key predictors reflect both individual behavior and system-level contexts, suggesting the need for tailored, context-specific interventions. These findings can inform targeted strategies to optimize telemedicine, improve appointment adherence, and promote equitable health care access.

Source journal: Telemedicine reports. CiteScore: 1.80. Self-citation rate: 0.00%. Review time: 8 weeks.