Leveraging Machine Learning to Develop Digital Engagement Phenotypes of Users in a Digital Diabetes Prevention Program: Evaluation Study.

JMIR AI. Pub Date: 2024-03-01. DOI: 10.2196/47122
Danissa V Rodriguez, Ji Chen, Ratnalekha V N Viswanadham, Katharine Lawrence, Devin Mann
Volume 3, article e47122. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11041485/pdf/

Abstract

Background: Digital diabetes prevention programs (dDPPs) are effective "digital prescriptions" but have high rates of attrition and program noncompletion. To address this, we developed a personalized automatic messaging system (PAMS) that leverages SMS text messaging and data integration into clinical workflows to increase dDPP engagement via enhanced patient-provider communication. Preliminary data showed positive results. However, further investigation is needed to determine how to tailor support technology such as PAMS to a user's preferences in order to boost their dDPP engagement.

Objective: This study evaluates the use of machine learning (ML) to develop digital engagement phenotypes of dDPP users and assesses ML's accuracy in predicting engagement with dDPP activities. The findings will feed into a PAMS optimization process that improves PAMS personalization by incorporating engagement prediction and digital phenotyping. This study aims (1) to prove the feasibility of using dDPP user-collected data to build an ML model that predicts engagement and contributes to identifying digital engagement phenotypes, (2) to describe methods for developing ML models with dDPP data sets and present preliminary results, and (3) to present preliminary data on user profiling based on ML model outputs.

Methods: Using a gradient-boosted forest model, we predicted engagement in 4 individual dDPP activities (physical activity, lessons, social activity, and weigh-ins) and in general activity (engagement in any activity) based on previous short- and long-term activity in the app. Model performance was assessed with the area under the receiver operating characteristic curve, the area under the precision-recall curve, and the Brier score. Shapley values quantified the feature importance of the models and determined which variables informed user profiling through latent profile analysis.
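The three performance metrics named in the Methods can be illustrated with a minimal, standard-library sketch. It is not the study's code: the function names and the toy labels and probabilities are made up for illustration, the precision-recall area is approximated by the step-wise "average precision" estimator, and tied scores are assumed absent in that estimator.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Averages the precision observed at the rank of each positive example.
    Assumes no tied scores.
    """
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("average precision needs at least one positive example")
    return total / hits


# Toy example (invented values): 1 = user engaged on the target day, 0 = did not.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, probs), average_precision(labels, probs), brier_score(labels, probs))
```

AUROC and the precision-recall area measure how well the model ranks engaged above non-engaged cases, while the Brier score also penalizes poorly calibrated probabilities, which is one reason to report all three together.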

Results: We developed 2 models using weekly and daily dDPP data sets (328,821 and 704,242 records, respectively), which yielded predictive accuracies above 90%. Although both models were highly accurate, the daily model better suited our research plan because it predicted daily changes in individual activities, which was crucial for creating the "digital phenotypes." To better understand the variables contributing to the model predictions, we calculated the Shapley values for both models to identify the features with the highest contribution to model fit; engagement with any activity in the dDPP in the last 7 days had the most predictive power. We profiled users with latent profile analysis after 2 weeks of engagement with the dDPP (Bayesian information criterion=-3222.46) and identified 6 profiles of users, including those with high engagement, minimal engagement, and attrition.
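To make the Shapley-value idea concrete, here is a small sketch that computes exact Shapley values by averaging each feature's marginal contribution over every ordering of the features. The abstract does not say how the values were computed for the study's models, so this is only the textbook definition; the feature names and the toy value function are hypothetical and not taken from the study.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    features: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values: mean marginal contribution over all feature orderings.

    Cost grows factorially with the number of features, so this is only
    practical for a handful of features.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_value(coalition: FrozenSet[str]) -> float:
    """Hypothetical model output when only the features in `coalition` are known."""
    out = 0.5  # baseline prediction with no features (invented number)
    if "active_last_7d" in coalition:
        out += 0.3
    if "lessons_last_30d" in coalition:
        out += 0.1
    if {"active_last_7d", "lessons_last_30d"} <= coalition:
        out += 0.05  # interaction term, split equally between the two features
    return out


phi = shapley_values(["active_last_7d", "lessons_last_30d"], toy_value)
print(phi)
```

In this toy setup the contributions sum to the difference between the full-feature output and the baseline (0.95 - 0.5), which is the efficiency property that makes Shapley values usable as an additive feature-importance measure.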

Conclusions: Preliminary results demonstrate that applying ML methods with predictive power is an acceptable mechanism to tailor and optimize messaging interventions that support patient engagement and adherence to digital prescriptions. The results enable future optimization of our existing messaging platform and expansion of this methodology to other clinical domains.

Trial registration: ClinicalTrials.gov NCT04773834; https://www.clinicaltrials.gov/ct2/show/NCT04773834.

International Registered Report Identifier (IRRID): RR2-10.2196/26750.
