Machine Learning–Based Asthma Attack Prediction Models From Routinely Collected Electronic Health Records: Systematic Scoping Review

JMIR AI. Pub Date: 2023-12-07. DOI: 10.2196/46717
Arif Budiarto, K. C. Tsang, Andrew M Wilson, Aziz Sheikh, Syed Ahmar Shah

Abstract

An early warning tool to predict attacks could enhance asthma management and reduce the likelihood of serious consequences. Electronic health records (EHRs), which provide access to historical data about patients with asthma, coupled with machine learning (ML), offer an opportunity to develop such a tool. Several studies have developed ML-based tools to predict asthma attacks.

This study aims to critically evaluate ML-based models derived using EHRs for the prediction of asthma attacks.

We systematically searched PubMed and Scopus (the search period was between January 1, 2012, and January 31, 2023) for papers meeting the following inclusion criteria: (1) used EHR data as the main data source, (2) used asthma attack as the outcome, and (3) compared ML-based prediction models' performance. We excluded non-English papers and nonresearch papers, such as commentary and systematic review papers. We also excluded papers that did not provide any details about the respective ML approach and its result, including protocol papers. The selected studies were then summarized across multiple dimensions, including data preprocessing methods, ML algorithms, model validation, model explainability, and model implementation.

Overall, 17 papers were included at the end of the selection process. There was considerable heterogeneity in how asthma attacks were defined. Of the 17 studies, 8 (47%) used routinely collected data from both primary care and secondary care practices. Extremely imbalanced data was a notable issue in most studies (13/17, 76%), but only 38% (5/13) of them explicitly dealt with it in their data preprocessing pipeline. A gradient boosting–based method was the best-performing ML method in 59% (10/17) of the studies. Of the 17 studies, 14 (82%) used a model explanation method to identify the most important predictors. None of the studies followed the standard reporting guidelines, and none were prospectively validated.

Our review indicates that this research field is still underdeveloped, given the limited body of evidence, heterogeneity of methods, lack of external validation, and suboptimally reported models. We highlighted several technical challenges (class imbalance, external validation, model explanation, and adherence to reporting guidelines to aid reproducibility) that need to be addressed to make progress toward clinical adoption.
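
To make the class imbalance challenge concrete, the following is a minimal sketch of one common way to handle it explicitly: weighting each class inversely to its frequency. The abstract does not say which techniques the included studies used, so this is an illustration only. The function name `balanced_class_weights` and the cohort of 50 attacks among 1000 patients are hypothetical; the formula is the widely used "balanced" heuristic, n_samples / (n_classes * class_count).

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so every class contributes
    the same total weight to a weighted training loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical cohort (not from the review):
# 1 = asthma attack during follow-up, 0 = no attack.
labels = [1] * 50 + [0] * 950

weights = balanced_class_weights(labels)

# Total weight carried by each class after reweighting.
total_weight = {cls: weights[cls] * labels.count(cls) for cls in weights}

print(weights)
print(total_weight)
```

Weights of this kind can typically be passed to a learner as per-sample or per-class weights. Resampling the training set (oversampling attacks or undersampling non-attacks) is an alternative, and either approach should be applied to the training data only so that validation metrics reflect the true attack rate.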