High Performance for Predicting Diabetic Nephropathy Using Stacking Regression of Ensemble Learning Method

International Journal of Online and Biomedical Engineering (iJOE) | Pub Date: 2024-05-21 | DOI: 10.3991/ijoe.v20i08.48387

L. Muflikhah, Amira G. Nurfansepta, Fitra A. Bachtiar, Dian E. Ratnawati


Citations: 0

Abstract

Diabetes can lead to several complications, one of the most prevalent and deadly of which is diabetic nephropathy. The condition is a significant health threat because it can cause irreversible harm to kidney function. Much current research focuses on how accurately the development of kidney disease in people with diabetes can be predicted. With this in mind, the study proposes a stacking regression approach for predicting albumin levels. These albumin values, derived from patients’ medical records, serve as a reference for the incidence of diabetic nephropathy. The method applies stacking regression from three different ensemble approaches, using Random Forest and CatBoost regressors, with the Huber algorithm as the meta-learner. How well the combination of parameters is chosen is a significant factor in the high performance the ensemble approach achieves, so a grid search was carried out to tune the hyperparameters of both regressor models. The proposed model was evaluated using accuracy, MAPE, RMSE, and MSE. Three selected variables (quantitative UACR, semi-quantitative UACR, and urinary creatinine) achieved high performance. Overall, the model obtained an accuracy of more than 98% with error rates (MAPE, RMSE, and MSE values) of less than 1%. In conclusion, the stacking regressor model can be used to predict diabetic nephropathy from clinical datasets.
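The pipeline the abstract describes (stacked regressors, a Huber meta-learner, grid-searched hyperparameters, and MAPE/RMSE/MSE evaluation) might be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than clinical, GradientBoostingRegressor stands in for CatBoost so that the block depends on scikit-learn only, the parameter grid and the scaling step in front of the meta-learner are invented for the example, and "accuracy" is left out because the abstract does not say how it is defined for a regression task.

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import HuberRegressor
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the study's clinical records): three positive
# features playing the role of the three selected variables, and a strictly
# positive target so that MAPE is well defined.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(300, 3))
y = 1.0 + 0.2 * X[:, 0] + 0.05 * X[:, 1] * X[:, 2] + rng.normal(0.0, 0.05, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Base regressors feed out-of-fold predictions to a Huber meta-learner.
# "gb" is a stand-in for CatBoost; scaling the meta-features is our assumption.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(random_state=0)),
        ("gb", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=make_pipeline(StandardScaler(), HuberRegressor(max_iter=1000)),
    cv=3,
)

# Hypothetical, deliberately tiny grid over both base regressors.
param_grid = {"rf__n_estimators": [30, 60], "gb__learning_rate": [0.05, 0.1]}
search = GridSearchCV(stack, param_grid, scoring="neg_mean_squared_error", cv=3)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out split.
y_pred = search.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))
mape = mean_absolute_percentage_error(y_test, y_pred)  # a fraction, not a percent
print(search.best_params_)
print(f"MSE={mse:.4f} RMSE={rmse:.4f} MAPE={100 * mape:.2f}%")
```

If the catboost package is available, its CatBoostRegressor could presumably replace the stand-in under the "gb" name, with the grid keys adjusted to that estimator's own parameters. The numbers printed here describe only the synthetic data and say nothing about the results reported in the paper.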

