Jie Xu, Heng Liu, Guisen Li, Wenjun Mi, Martin Gallagher, Yunlin Feng
Strategies to Mitigate Model Drift of a Machine Learning Prediction Model for Acute Kidney Injury in General Hospitalization
International Journal of Intelligent Systems, 2025(1). Published 2025-02-24. Journal Article.
DOI: 10.1155/int/2240862
https://onlinelibrary.wiley.com/doi/10.1155/int/2240862
Citation count: 0
Abstract
Background: Model drift is a major challenge for applications of clinical prediction models. We aimed to investigate the effect of two strategies to mitigate model drift based on a previously reported prediction model for acute kidney injury (AKI).
Methods: Deidentified electronic medical data of inpatients at Sichuan Provincial People’s Hospital from January 1, 2019, to December 31, 2022, were collected. AKI was defined by the KDIGO criteria. The top 50 laboratory variables, along with sex, age, and the top 20 prescribed medicines, were included as predictive variables. For model optimization, the convolutional neural network module was replaced by a self-attention module. Periodic refitting with accumulative data was also conducted before each temporal external validation. The performance of the new model (ATRN) was compared with that of the previous model (ATCN) and four other models.
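To make the self-attention idea concrete, the following is a minimal sketch of scaled dot-product self-attention over a short time series, written in plain Python. It is not the ATRN architecture: the abstract does not describe the module's details, so the identity query/key/value projections, the function names, and the toy input are all assumptions chosen for illustration. A trained module would use learned projections.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections (an assumption).

    x: list of T time steps, each a list of d feature values.
    Returns (outputs, weights), where weights[t][s] is the share of
    attention that time step t places on time step s.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    weights = []
    for query in x:
        # Similarity of this time step to every time step in the series.
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in x]
        weights.append(softmax(scores))
    # Each output step is an attention-weighted average of all input steps.
    outputs = [
        [sum(w * value[j] for w, value in zip(w_row, x)) for j in range(d)]
        for w_row in weights
    ]
    return outputs, weights


# Hypothetical input: 4 daily measurements of 2 standardized lab values.
series = [[0.1, 0.0], [0.2, 0.1], [1.5, 1.2], [1.6, 1.4]]
outputs, weights = self_attention(series)
```

In this toy series the last time step places more weight on the two most recent, similar measurements than on the earliest ones, which is the kind of short-range dependency the abstract refers to.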
Results: A total of 150,373 admissions were identified. The annual incidence of AKI varied between 5.57% and 5.8%. The performance of the models that used temporal features declined markedly over time. The ATRN model, whose module is better suited to capturing short-term time dependencies, outperformed the other five models in both C-statistic and recall. Periodically refitting the prediction model with accumulative data also helped mitigate model drift, especially in models using time series data.
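For readers unfamiliar with the two reported metrics, here is a small sketch of how a C-statistic (equivalent to the area under the ROC curve for a binary outcome) and recall can be computed. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not data or settings from the study.

```python
def c_statistic(y_true, y_score):
    """Fraction of (positive, negative) pairs in which the positive case
    receives the higher score; tied scores count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def recall(y_true, y_score, threshold=0.5):
    """Share of true positive cases flagged at the given threshold.

    The default threshold of 0.5 is an arbitrary illustrative choice.
    """
    flagged = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    total_pos = sum(1 for y in y_true if y == 1)
    return flagged / total_pos if total_pos else float("nan")


# Made-up example: 1 = AKI, 0 = no AKI, with predicted risk scores.
y_true = [0, 0, 1, 0, 1, 0, 1, 0]
y_score = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.60, 0.50]

auc = c_statistic(y_true, y_score)   # 13 of 15 pairs are concordant
sens = recall(y_true, y_score)       # 2 of 3 positives score >= 0.5
```

Tracking both values on each successive validation period is one simple way to see drift: a falling C-statistic indicates weaker ranking of patients, while falling recall at a fixed threshold indicates more missed cases.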
Conclusions: Enhancing the model’s ability to capture short-term time dependencies in time series data and periodically refitting it with accumulative data both mitigated model drift. The greatest improvement in model performance was observed when the two strategies were combined.
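The refitting strategy can be pictured as an expanding-window schedule: before each temporal validation, the model is refit on all data accumulated so far. The sketch below only builds such a schedule. The yearly refit interval and the function names are assumptions, since the abstract does not state how often refitting was performed; the 2019 to 2022 range is the study period given in the Methods.

```python
def cumulative_refit_schedule(periods):
    """Expanding window: train on every period seen so far, validate on the next."""
    periods = sorted(periods)
    return [(periods[:i], periods[i]) for i in range(1, len(periods))]


def fixed_model_schedule(periods):
    """Baseline without refitting: train once on the first period only."""
    periods = sorted(periods)
    return [(periods[:1], p) for p in periods[1:]]


# Study period from the abstract; yearly granularity is an assumption.
study_years = [2019, 2020, 2021, 2022]

refit_plan = cumulative_refit_schedule(study_years)
fixed_plan = fixed_model_schedule(study_years)

for train_years, validation_year in refit_plan:
    print(f"refit on {train_years} -> validate on {validation_year}")
```

Comparing metrics from `refit_plan` against those from `fixed_plan` on the same validation years isolates how much of the performance decline is recovered by refitting alone.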
About the journal:
The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.