Benjamin Post MBChB, Roman Klapaukh PhD, Prof Stephen J Brett MD, Prof A Aldo Faisal PhD
Lancet Digital Health, volume 7, issue 2, pages e124-e135. Published 2025-02-01. DOI: 10.1016/S2589-7500(24)00254-1
https://www.sciencedirect.com/science/article/pii/S2589750024002541
Harnessing temporal patterns in administrative patient data to predict risk of emergency hospital admission
Background
Unplanned hospital admissions are associated with worse patient outcomes and cause strain on health systems worldwide. Primary care electronic health records (EHRs) have successfully been used to create prediction models for emergency hospitalisation, but these approaches require a broad range of diagnostic, physiological, and laboratory values. In this study, we aimed to capture temporal patterns of patient activity from EHR data and evaluate their effectiveness in predicting emergency hospital admissions compared with conventional methods.
Methods
In this retrospective observational study, we used the Secure Anonymised Information Linkage databank to extract temporal patterns of primary care activity from undifferentiated electronic health record timestamp data for 1·37 million patients in Wales aged 18–80 years with at least one recorded Read code between the years 2016 and 2018. Using Gaussian mixture modelling, we grouped patients into distinct temporal clusters, performed a three-stage validation of our approach, and calculated the risk of emergency hospital admission for each temporal cluster group. Finally, these temporal clusters were combined with five administrative variables and incorporated into four emergency hospital admission prediction models (logistic regression, naive Bayes, XGBoost, and multilayer perceptron [MLP]) and compared with a more traditional, but data-intensive, modelling technique. The primary outcome was emergency hospital admission as the next health-care event.
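The clustering step can be illustrated with a minimal sketch of Gaussian mixture modelling fitted by expectation-maximisation. This is not the study's pipeline: the abstract does not state which temporal features were used, so the single feature here (mean gap in days between consecutive primary care contacts), the toy values, the two-component setting, and all function names are assumptions made for illustration only. The study itself reports six clusters.

```python
import math


def responsibilities(x, weights, means, variances):
    """E-step for one value: posterior probability of each component (log-space for stability)."""
    log_d = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(log_d)
    unnorm = [math.exp(value - top) for value in log_d]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def fit_gmm_1d(values, n_components=2, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: initial means are evenly spaced quantiles of the data.
    means = [ordered[(2 * k + 1) * n // (2 * n_components)] for k in range(n_components)]
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    variances = [overall_var] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        resp = [responsibilities(v, weights, means, variances) for v in values]
        # M-step: re-estimate each component from its soft assignments.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-12  # guard against an empty component
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances


def assign_clusters(values, weights, means, variances):
    """Hard cluster label per value: the component with the highest responsibility."""
    labels = []
    for v in values:
        r = responsibilities(v, weights, means, variances)
        labels.append(max(range(len(r)), key=lambda k: r[k]))
    return labels


# Hypothetical feature: mean gap (days) between consecutive primary care contacts.
mean_gap_days = [5, 6, 7, 8, 9, 7, 6, 80, 90, 100, 95, 85]
weights, means, variances = fit_gmm_1d(mean_gap_days, n_components=2)
labels = assign_clusters(mean_gap_days, weights, means, variances)
print("component means:", [round(m, 2) for m in means])
print("cluster labels:", labels)
```

On this toy input the frequent attenders and the infrequent attenders fall into separate components. A per-cluster admission risk could then be estimated as the share of patients in each cluster whose next health-care event was an emergency admission.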
Findings
Six distinct temporal cluster patterns of primary care EHR activity were identified, associated with varying risks of future emergency hospital admission. These patterns were visually interpretable, repeatable at a population level, and clinically plausible. The best emergency hospital admission prediction model (MLP) achieved an area under the receiver operating characteristic curve (AUROC) of 0·82 and precision of 0·94 in regional cohorts. In external validation in regional cohorts, similar model performance was observed (AUROC 0·82 and precision 0·92). This model also matched the performance of a more complex model (extended feature model) requiring 33 clinical parameters (AUROC 0·82 vs 0·83; precision 0·94 vs 0·90) for the same task on the same dataset.
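For readers less familiar with the two reported metrics, the sketch below shows how AUROC and precision are computed from predicted risk scores. The labels, scores, and the 0·5 decision threshold are made-up toy values, not study data, and the abstract does not state which threshold the authors used.

```python
def auroc(labels, scores):
    """Probability that a random positive case outscores a random negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision(labels, scores, threshold=0.5):
    """Share of cases flagged as admissions (score >= threshold) that really were admissions."""
    flagged = [s >= threshold for s in scores]
    true_pos = sum(1 for y, f in zip(labels, flagged) if f and y == 1)
    false_pos = sum(1 for y, f in zip(labels, flagged) if f and y == 0)
    if true_pos + false_pos == 0:
        raise ValueError("no cases were flagged at this threshold")
    return true_pos / (true_pos + false_pos)


# Toy data: 1 = next event was an emergency admission, 0 = it was not.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]
toy_auroc = auroc(toy_labels, toy_scores)          # 14 of 16 pairs ranked correctly -> 0.875
toy_precision = precision(toy_labels, toy_scores)  # 3 true positives of 5 flagged -> 0.6
print(toy_auroc, toy_precision)
```

AUROC does not depend on a threshold, whereas precision does, so the two figures can move independently when the decision threshold changes.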
Interpretation
We developed a novel machine learning pipeline that extracts interpretable temporal patterns from simple representations of EHR data and can be incorporated into emergency hospital admission predictors. This framework might enable more rapid development of parsimonious clinical prediction models.
Funding
UKRI CDT in AI for Healthcare, UKRI Turing AI Fellowship, NIHR Imperial Biomedical Research Centre, and Research Capability Funding.
About the journal:
The Lancet Digital Health publishes important, innovative, and practice-changing research on any topic connected with digital technology in clinical medicine, public health, and global health.
The journal’s open access content crosses subject boundaries, building bridges between health professionals and researchers. By bringing together the most important advances in this multidisciplinary field, The Lancet Digital Health is the most prominent publishing venue in digital health.
We publish a range of content types including Articles, Review, Comment, and Correspondence, helping to promote digital technologies in health practice worldwide.