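Strategies to Mitigate Model Drift of a Machine Learning Prediction Model for Acute Kidney Injury in General Hospitalization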

IF 3.7 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Intelligent Systems Pub Date : 2025-02-24 DOI:10.1155/int/2240862

Jie Xu, Heng Liu, Guisen Li, Wenjun Mi, Martin Gallagher, Yunlin Feng


Citations: 0

Abstract



Background: Model drift is a major challenge for applications of clinical prediction models. We aimed to investigate the effect of two strategies to mitigate model drift based on a previously reported prediction model for acute kidney injury (AKI).

Methods: Deidentified electronic medical data of inpatients at Sichuan Provincial People’s Hospital from January 1, 2019, to December 31, 2022, were collected. AKI was defined by the KDIGO criteria. The top 50 laboratory variables, along with sex, age, and the top 20 prescribed medicines, were included as predictive variables. In model optimization, the convolutional neural network module was replaced by a self-attention module. Periodic refitting with accumulated data was also conducted before each temporal external validation. The performance of the new model (ATRN) was compared with that of the previous model (ATCN) and four other models.
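To make the term "self-attention module" concrete, the following is a minimal sketch of generic single-head scaled dot-product self-attention over a short time series. It is not the ATRN architecture, whose details are not given in the abstract. The function names (`matmul`, `softmax`, `self_attention`) and the weight matrices `w_q`, `w_k`, `w_v` are illustrative assumptions, and a real model would use a deep learning framework rather than plain lists.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(
    x: Matrix, w_q: Matrix, w_k: Matrix, w_v: Matrix
) -> Tuple[Matrix, Matrix]:
    """Generic single-head self-attention (hypothetical helper, not ATRN itself).

    x holds one row per time step (T x d_in); the projections are d_in x d_k.
    Returns the attended sequence (T x d_k) and the T x T attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]          # transpose of k
    scores = matmul(q, k_t)                       # similarity of every pair of time steps
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights
```

Each output row is a weighted average of the value vectors of all time steps, so every time step can draw directly on any other step in the input window.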

Results: A total of 150,373 admissions were identified. The annual incidence of AKI varied between 5.57% and 5.8%. The performance of the models that used temporal features declined profoundly over time. The ATRN model, whose module is better suited to capturing short-term time dependencies, outperformed the other five models in both C-statistics and recall rates. Periodically refitting the prediction model with accumulated data also helped to mitigate model drift, especially in models using time series data.
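The two metrics named above can be computed as in the sketch below. It assumes binary labels (1 = AKI) and predicted risk scores; the 0.5 threshold for recall is an assumption, since the abstract does not state the operating point the authors used.

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative case.

    Ties count as 0.5. This pairwise version is O(n_pos * n_neg); for a cohort
    of the size reported here a rank-based implementation would be preferable.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


def recall(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> float:
    """Fraction of true positive cases whose score reaches the threshold."""
    flagged = [s >= threshold for y, s in zip(y_true, y_score) if y == 1]
    if not flagged:
        raise ValueError("need at least one positive case")
    return sum(flagged) / len(flagged)
```

With an annual AKI incidence below 6%, accuracy alone would be uninformative, which is one reason to report a rank-based metric alongside recall.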

Conclusions: Enhancing the model’s ability to capture short-term time dependencies in time series data and periodically refitting it with accumulated data both mitigated model drift. Model performance improved most when the two strategies were combined.
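The periodic refitting strategy can be sketched as a loop over consecutive time periods. This is an illustration under stated assumptions rather than the authors' pipeline: `temporal_validation`, `fit`, and `score` are hypothetical names, the period granularity (for example, calendar years) is assumed, and any model and metric can be plugged in.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


def temporal_validation(
    periods: Sequence[Hashable],
    data_by_period: Dict[Hashable, list],
    fit: Callable[[list], object],
    score: Callable[[object, list], float],
    refit: bool,
) -> List[Tuple[Hashable, float]]:
    """Train on the first period, then validate on each later period in order.

    With refit=True the model is refitted on all data accumulated so far
    before each validation; with refit=False the first model stays frozen.
    """
    accumulated = list(data_by_period[periods[0]])  # copy, so inputs are not mutated
    model = fit(accumulated)
    results = []
    for period in periods[1:]:
        if refit:
            # Refit on everything observed strictly before `period`.
            model = fit(accumulated)
        results.append((period, score(model, data_by_period[period])))
        # The validated period becomes training data for later periods.
        accumulated.extend(data_by_period[period])
    return results
```

Running the function once with `refit=False` and once with `refit=True` on the same data gives a direct comparison of a frozen model against a periodically refitted one, while never letting a model see data from the period it is validated on.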


Source journal

International Journal of Intelligent Systems (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 11.30

Self-citation rate: 14.30%

Articles published: 304

Review time: 9 months

About the journal: The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.