Development of a machine learning tool to predict the risk of incident chronic kidney disease using health examination data.

Impact factor: 3.0 · Zone 3 (Medicine) · JCR Q2 (Public, Environmental & Occupational Health)
Frontiers in Public Health Pub Date : 2024-11-01 eCollection Date: 2024-01-01 DOI:10.3389/fpubh.2024.1495054
Yuki Yoshizaki, Kiminori Kato, Kazuya Fujihara, Hirohito Sone, Kohei Akazawa
Frontiers in Public Health, volume 12, article 1495054. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11566449/pdf/

Abstract

Background: Chronic kidney disease (CKD) is characterized by a decreased glomerular filtration rate or renal injury (especially proteinuria) for at least 3 months. The early detection and treatment of CKD, a major global public health concern, before the onset of symptoms is important. This study aimed to develop machine learning models to predict the risk of developing CKD within 1 and 5 years using health examination data.

Methods: Data were collected from patients who underwent annual health examinations between 2017 and 2022. Among the 30,273 participants included in the study, 1,372 had CKD. Demographic characteristics, body mass index, blood pressure, blood and urine test results, and questionnaire responses were used to predict the risk of CKD development at 1 and 5 years. This study examined three outcomes: incident estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m², the development of proteinuria, and incident eGFR <60 mL/min/1.73 m² or the development of proteinuria. Logistic regression (LR), conditional logistic regression, neural network, and recurrent neural network were used to develop the prediction models.
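
To make the three outcome definitions concrete, the following is a minimal sketch (not the authors' code) of how they could be labeled at one follow-up examination. The names `label_outcomes` and `Outcomes`, the boolean proteinuria flag, and the assumption that the participant was free of both conditions at baseline are illustrative; the abstract does not state how proteinuria was graded.

```python
from typing import NamedTuple

# Cut-off named in the abstract, in mL/min/1.73 m^2.
EGFR_THRESHOLD = 60.0


class Outcomes(NamedTuple):
    low_egfr: bool     # outcome 1: incident eGFR < 60
    proteinuria: bool  # outcome 2: development of proteinuria
    composite: bool    # outcome 3: either of the two above


def label_outcomes(followup_egfr: float, followup_proteinuria: bool) -> Outcomes:
    """Label the three outcomes for a participant assumed event-free at baseline.

    `followup_proteinuria` is a hypothetical yes/no flag; how the study
    derived it from urine test results is not given in the abstract.
    """
    low_egfr = followup_egfr < EGFR_THRESHOLD
    return Outcomes(
        low_egfr=low_egfr,
        proteinuria=followup_proteinuria,
        composite=low_egfr or followup_proteinuria,
    )
```

Under these assumptions, a participant with a follow-up eGFR of 58 and no proteinuria is positive for the first and third outcomes only, which is why the composite outcome can behave differently from either component.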

Results: All models had predictive values, sensitivities, and specificities >0.8 for predicting the onset of CKD in 1 year when the outcome was eGFR <60 mL/min/1.73 m². The area under the receiver operating characteristic curve (AUROC) was >0.9. With LR and a neural network, the specificities were 0.749 and 0.739 and AUROCs were 0.889 and 0.890, respectively, for predicting onset within 5 years. The AUROCs of most models were approximately 0.65 when the outcome was eGFR <60 mL/min/1.73 m² or proteinuria. The predictive performance of all models exhibited a significant decrease when eGFR was not included as an explanatory variable (AUROCs: 0.498-0.732).
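
For readers less familiar with the reported metrics, here is a minimal sketch of how sensitivity, specificity, and AUROC are computed from true labels and predicted risks. The function names, the 0.5 decision threshold, and the toy data are assumptions for illustration only; they are not taken from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given risk threshold.

    Assumes y_true contains at least one case (1) and one non-case (0).
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random case outscores a random non-case.

    Ties count as half a win. Assumes both classes are present.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


# Toy data, invented for this example (not study data).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_risks = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
sens, spec = sensitivity_specificity(toy_labels, toy_risks)
area = auroc(toy_labels, toy_risks)
```

An AUROC of 0.5 corresponds to ranking cases and non-cases no better than chance, which is useful context for the lower end of the 0.498-0.732 range reported when eGFR was excluded.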

Conclusion: Machine learning models can predict the risk of CKD, and eGFR plays a crucial role in predicting the onset of CKD. However, it is difficult to predict the onset of proteinuria based solely on health examination data. Further studies must be conducted to predict the decline in eGFR and increase in urine protein levels.

Source journal
Frontiers in Public Health (Medicine: Public Health, Environmental and Occupational Health)
CiteScore: 4.80
Self-citation rate: 7.70%
Articles published: 4469
Review time: 14 weeks
About the journal: Frontiers in Public Health is a multidisciplinary open-access journal which publishes rigorously peer-reviewed research and is at the forefront of disseminating and communicating scientific knowledge and impactful discoveries to researchers, academics, clinicians, policy makers and the public worldwide. The journal aims at overcoming current fragmentation in research and publication, promoting consistency in pursuing relevant scientific themes, and supporting the dissemination of findings and their translation into practice. Frontiers in Public Health is organized into Specialty Sections that cover different areas of research in the field. Please refer to the author guidelines for details on article types and the submission process.