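A novel method to predict the haemoglobin concentration after kidney transplantation based on machine learning: prediction model establishment and method optimization.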

IF 3.3 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS

BMC Medical Informatics and Decision Making Pub Date : 2025-07-08 DOI:10.1186/s12911-025-03060-1

Songping He, Xiangxi Li, Fangyu Peng, Jiazhi Liao, Xia Lu, Hui Guo, Xin Tan, Yanyan Chen


Citations: 0

Abstract



A novel method to predict the haemoglobin concentration after kidney transplantation based on machine learning: prediction model establishment and method optimization.

Background: Anaemia is a common complication after kidney transplantation, and the haemoglobin concentration is one of the main criteria for identifying anaemia. Moreover, artificial intelligence methods have developed rapidly in recent years, are widely used in the medical field and have achieved good results.

Objective: To optimize the process of constructing a clinical prediction model based on machine learning and improve related technologies. A classification prediction model for the haemoglobin concentration after kidney transplantation was constructed.

Methods: Real-world data from 854 kidney transplant patients in a Grade A tertiary hospital were retrospectively extracted. An imputation method combining the K-nearest neighbour algorithm and a multilayer perceptron was used to fill in missing values in the dataset. Recursive feature elimination and extreme gradient boosting were used to rank and screen patient features by importance and to reduce the dimensionality of the feature set. Before the classification prediction model was established, the number of classification categories was determined first, and the optimal ideal cluster was approximated by the ideal cluster under each classification number and the similarity between the ideal cluster and the actual cluster. Finally, five machine learning methods (random forest, extreme gradient boosting, light gradient boosting machine, linear support vector classifier and support vector machine) were used to establish classification prediction models, and error-correcting output codes were used to optimize each model. A classification prediction model for abnormal haemoglobin concentrations after kidney transplantation was constructed, and its predictive performance was verified.
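The nearest-neighbour half of the imputation step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe how the K-nearest neighbour algorithm is combined with the multilayer perceptron, so the code below covers only plain K-nearest-neighbour mean imputation. The function name `knn_impute`, the distance rescaling and the sample values are all hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is Euclidean over the columns both rows have observed, rescaled
    by the share of columns compared (an assumption of this sketch). Features
    are used unscaled here; real data would normally be standardised first.
    """
    n_cols = len(rows[0])
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j in range(n_cols):
            if row[j] is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbour must be a different row with column j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                squared = sum((a - b) ** 2 for a, b in shared)
                dist = math.sqrt(squared * n_cols / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = [value for _, value in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Made-up rows: three numeric features, one missing value in the second row.
rows = [
    [30.0, 1.0, 120.0],
    [32.0, 1.1, None],
    [60.0, 2.0, 90.0],
    [31.0, 0.9, 124.0],
]
imputed = knn_impute(rows, k=2)
print(imputed[1])  # [32.0, 1.1, 122.0]
```

The missing value is replaced by the mean of the two closest rows (the first and fourth), while observed values are left untouched. In practice a library routine such as scikit-learn's `KNNImputer` would likely be preferable to hand-written code.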

Results: The imputation method combining the K-nearest neighbour algorithm and multilayer perceptron performed better at imputing missing values than commonly used imputation methods. Among the machine learning methods used for modelling, the predictions of the tree-based models improved to some degree after error-correcting output code optimization. The best final model was the optimized extreme gradient boosting model, with prediction accuracies of 85.98% before and 87.22% after optimization.
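For readers unfamiliar with error-correcting output codes, the idea can be sketched as follows. Each class is assigned a binary codeword, one binary learner is trained per codeword bit, and a prediction is decoded to the class with the nearest codeword, so a small number of wrong binary learners can be tolerated. The abstract does not state the number of classes or the code matrix the authors used; the four-class matrix and all names below are assumptions for illustration only.

```python
# Hypothetical code matrix: the exhaustive 7-bit code for 4 classes.
# Row = class label, column = one binary learner's target for that class.
CODE_MATRIX = {
    0: (1, 1, 1, 1, 1, 1, 1),
    1: (0, 0, 0, 0, 1, 1, 1),
    2: (0, 0, 1, 1, 0, 0, 1),
    3: (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def binary_targets(labels, column):
    """Relabel multiclass labels as the 0/1 targets for one binary learner."""
    return [CODE_MATRIX[y][column] for y in labels]


def decode(bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODE_MATRIX, key=lambda c: hamming(CODE_MATRIX[c], bits))


# Smallest pairwise distance between codewords; a code with minimum
# distance d can correct up to (d - 1) // 2 wrong bits.
min_distance = min(
    hamming(CODE_MATRIX[a], CODE_MATRIX[b])
    for a in CODE_MATRIX for b in CODE_MATRIX if a < b
)

# Class 2 has codeword (0, 0, 1, 1, 0, 0, 1); suppose the fifth binary
# learner is wrong and outputs 1 instead of 0.
noisy_bits = (0, 0, 1, 1, 1, 0, 1)
print(min_distance)        # 4
print(decode(noisy_bits))  # 2
```

With a minimum codeword distance of 4, any single wrong bit still decodes to the correct class. scikit-learn's `OutputCodeClassifier` offers a ready-made wrapper around a base estimator, although it generates its code matrix randomly rather than using a fixed matrix like the one above.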

Conclusions: The accuracy of the machine learning classification prediction model established by the optimized modelling method and process reached 87.22%, which can assist doctors in preoperative risk prediction.


Source journal

BMC Medical Informatics and Decision Making (Medicine: Medical Informatics)

CiteScore

7.20

Self-citation rate

5.70%

Articles published

297

Review time

1 month

About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.