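Development and validation of machine learning models to predict unplanned hospitalizations of patients with diabetes within the next 12 months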

IF 0.7 Q4 ENDOCRINOLOGY & METABOLISM
Diabetes Mellitus Pub Date : 2024-05-06 DOI:10.14341/dm13065
A. Andreychenko, A. D. Ermak, D. V. Gavrilov, R. Novitskiy, A. V. Gusev
Citations: 0

Abstract

Development and validation of machine learning models to predict unplanned hospitalizations of patients with diabetes within the next 12 months
BACKGROUND: The incidence of diabetes mellitus (DM) has been rising steadily for several decades, both in the Russian Federation and worldwide. Stable population growth and the current epidemiological characteristics of DM lead to enormous economic costs and significant social losses throughout the world. The disease often progresses with the development of specific complications, which significantly increase the likelihood of hospitalization. Creating a machine learning model and applying it to predict inpatient hospitalization of patients with DM will make it possible to personalize medical care and optimize the load on the entire healthcare system.

AIM: To develop and validate models for predicting unplanned hospitalizations of patients with diabetes due to the disease itself and its complications, using machine learning algorithms and data from real clinical practice.

MATERIALS AND METHODS: The study included 170,141 depersonalized electronic health records of 23,742 patients with diabetes. Anamnestic, constitutional, clinical, instrumental and laboratory data widely used in routine medical practice were considered as potential predictors, 33 features in total. Logistic regression (LR), gradient boosting methods (LightGBM, XGBoost, CatBoost), decision tree-based methods (RandomForest and ExtraTrees), and a neural network-based algorithm (Multi-layer Perceptron) were compared. External validation was performed on data from a separate region of the Russian Federation.

RESULTS: The LightGBM model showed the best results and the greatest stability on external validation data, with an AUC of 0.818 (95% CI 0.802–0.834) in internal testing and 0.802 (95% CI 0.773–0.832) in external validation.

CONCLUSION: The metrics of the best model were superior to those reported in previously published studies. External validation showed that the model is relatively stable on new data from another region, which indicates that it could be applied in real clinical practice.
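
The abstract reports each AUC with a 95% confidence interval but does not say how the intervals were obtained. The following is a minimal sketch of one common approach, a percentile bootstrap over patients; this choice is an assumption, not the authors' stated method. The function names `roc_auc` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical and exist only for illustration.

```python
import random


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formula; tied scores get average ranks."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap (1 - alpha) interval (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return roc_auc(labels, scores), lower, upper


# Toy data, invented for illustration: 1 = unplanned hospitalization within 12 months.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.60]
auc, ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500)
print(f"AUC {auc:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

On a real cohort the same functions would take the held-out labels and the model's predicted probabilities. With repeated records per patient, resampling patients rather than individual records would be the safer design.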
Source journal: Diabetes Mellitus (ENDOCRINOLOGY & METABOLISM)
CiteScore: 1.90
Self-citation rate: 40.00%
Articles published: 61
Review time: 7 weeks