A novel method to predict the haemoglobin concentration after kidney transplantation based on machine learning: prediction model establishment and method optimization
Authors: Songping He, Xiangxi Li, Fangyu Peng, Jiazhi Liao, Xia Lu, Hui Guo, Xin Tan, Yanyan Chen
Journal: BMC Medical Informatics and Decision Making, 25(1):255 (published 2025-07-08; impact factor 3.3; JCR Q2, Medical Informatics)
DOI: https://doi.org/10.1186/s12911-025-03060-1
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12236034/pdf/
Background: Anaemia is a common complication after kidney transplantation, and the haemoglobin concentration is one of the main criteria for identifying it. Artificial intelligence methods have developed rapidly in recent years and are now widely used in medicine, where they have achieved good results.
Objective: To optimize the process of constructing a machine-learning-based clinical prediction model, to improve the related techniques, and to construct a classification model that predicts the haemoglobin concentration after kidney transplantation.
Methods: Real-world data from 854 kidney transplant patients at a Grade A tertiary hospital were extracted retrospectively. Missing values in the dataset were filled in with an imputation method that combines the K-nearest neighbour algorithm and a multilayer perceptron. Recursive feature elimination with extreme gradient boosting was used to rank patient features by importance, screen them, and reduce the dimensionality of the feature set. Before the classification model was built, the number of classification categories was determined: the optimal ideal cluster was approximated by constructing the ideal cluster for each candidate number of categories and measuring its similarity to the actual cluster. Finally, five machine learning methods (random forest, extreme gradient boosting, light gradient boosting machine, linear support vector classifier and support vector machine) were used to build classification models, and error-correcting output codes were used to optimize each model. A classification model for predicting abnormal haemoglobin concentrations after kidney transplantation was constructed, and its predictive performance was verified.
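The nearest-neighbour half of the imputation step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe how the K-nearest neighbour algorithm and the multilayer perceptron are combined, so the sketch covers only plain KNN imputation. The function name `knn_impute`, the distance definition, the value of `k` and the toy records are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows (hypothetical helper).

    Distance is the root mean squared difference over the features that both
    rows have observed. Features are used unscaled here to keep the sketch
    short; real data would normally be standardised first.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]  # copy so the input stays untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors: other rows with feature j observed and a finite distance.
            donors = [
                (distance(row, other), other[j])
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Invented toy records: [age, creatinine, haemoglobin]
records = [
    [40.0, 1.0, 120.0],
    [42.0, 1.1, 118.0],
    [70.0, 2.5, 90.0],
    [41.0, 1.0, None],
]
filled = knn_impute(records, k=2)
```

In this toy example the missing haemoglobin value in the last record is filled from the two most similar patients (the first two records), while the third, dissimilar record is ignored.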
Results: The imputation method combining the K-nearest neighbour algorithm and a multilayer perceptron imputed missing values better than commonly used imputation methods. Among the machine learning methods used for modelling, the predictions of the tree-based models improved to some degree after optimization with error-correcting output codes. The best final model was the optimized extreme gradient boosting model, with prediction accuracies of 85.98% before and 87.22% after optimization.
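To make the error-correcting output code step concrete, the following is a minimal sketch of the general technique, not the authors' configuration. Each class is assigned a binary codeword, one binary classifier is trained per bit, and a prediction is decoded to the class whose codeword is nearest in Hamming distance. The three class names, the 6-bit codebook and the helper names are assumptions; the abstract does not state the number of classes or the code design.

```python
from itertools import combinations

# Hypothetical codebook: one 6-bit codeword per class. Every pair of
# codewords differs in 4 positions, so one wrong bit can be corrected.
CODEBOOK = {
    "low": (1, 1, 0, 1, 0, 0),
    "normal": (0, 1, 1, 0, 1, 0),
    "high": (1, 0, 1, 0, 0, 1),
}


def hamming(a, b):
    """Number of positions at which two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def binary_targets(labels, bit):
    """Relabel multiclass labels as the 0/1 targets for one bit's classifier."""
    return [CODEBOOK[label][bit] for label in labels]


def decode(bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODEBOOK, key=lambda label: hamming(CODEBOOK[label], bits))


def min_code_distance():
    """Smallest Hamming distance between any two codewords."""
    return min(hamming(a, b) for a, b in combinations(CODEBOOK.values(), 2))


# Suppose the six binary classifiers output the codeword for "low",
# except that the second classifier makes a mistake (bit flipped 1 -> 0).
noisy_prediction = (1, 0, 0, 1, 0, 0)
decoded = decode(noisy_prediction)
```

Because the codewords are well separated, the single wrong bit still decodes to "low". This redundancy is the mechanism by which error-correcting output codes can improve a multiclass model built from binary learners.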
Conclusions: The machine learning classification model established with the optimized modelling method and process reached an accuracy of 87.22% and can assist doctors in preoperative risk prediction.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.