Jie Cheng , Fei Chen , Yunxiao Song , Siyang Wang , Jingying Jia , Hang Wang , Houbao Liu
Clinica Chimica Acta, volume 575, Article 120367 (Journal Article). Published 2025-05-15. DOI: 10.1016/j.cca.2025.120367
https://www.sciencedirect.com/science/article/pii/S0009898125002463
Impact factor: 3.2. JCR: Q2 (Medical Laboratory Technology).
Development and validation of a machine learning model based on complete blood counts to predict clinical outcomes in urothelial carcinoma patients
Urothelial carcinoma (UC) is a highly malignant disease with significant public health implications. Despite advancements in oncology, early diagnosis and effective prognostic tools remain limited. This study aimed to develop a machine learning model using complete blood count (CBC) data to predict clinical outcomes in UC patients. A retrospective, two-center cohort study was conducted, analyzing 23 CBC variables from 477 UC patients at Xuhui Hospital of Fudan University (discovery cohort) and 297 UC patients from Putuo People’s Hospital of Tongji University (validation cohort). CBC data were collected before treatment and three months posttreatment, with overall survival (OS) as the primary endpoint. Nine machine learning models were developed in the discovery cohort and validated independently. Feature selection identified a logistic regression (LR) model incorporating white blood cell (WBC) count and lymphocyte percentage (LYMPH%) as the optimal predictor. The model achieved high performance, with an area under the ROC curve (AUC) of 0.93 (95% CI: 0.90–0.97), area under the precision-recall curve (AUPRC) of 0.94 (95% CI: 0.89–0.99), positive predictive value (PPV) of 0.87 (95% CI: 0.75–0.98), negative predictive value (NPV) of 0.82 (95% CI: 0.78–0.87), accuracy of 0.83 (95% CI: 0.80–0.88), and F1 score of 0.82 (95% CI: 0.79–0.86) in the discovery cohort, and comparable results in the validation cohort (AUC 0.88 [95% CI: 0.84–0.93], AUPRC 0.81 [95% CI: 0.75–0.86], PPV 0.77 [95% CI: 0.71–0.84], NPV 0.89 [95% CI: 0.84–0.95], accuracy 0.84 [95% CI: 0.80–0.89], and F1 score 0.80 [95% CI: 0.74–0.87]). Decision curve analysis demonstrated consistent net benefits, while Kaplan–Meier analysis indicated significantly shorter OS in the “predict worse outcomes” subgroup. Posttreatment, WBC counts increased and LYMPH% decreased in deceased patients, whereas survivors showed the opposite trends (P < 0.05). These findings suggest that a simple, cost-effective CBC-based machine learning model can effectively predict UC prognosis, aiding clinical decision-making.
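For readers less familiar with the metrics quoted in the abstract, the sketch below shows how PPV, NPV, accuracy, F1, and AUC are conventionally computed from binary labels and predicted scores. It is an illustration only, not the authors' code: the labels, scores, the 0.5 threshold, and the helper names (`confusion_counts`, `summary_metrics`, `auc`) are all hypothetical, and the toy data are not drawn from the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def confusion_counts(y_true, scores, threshold=0.5):
    """Count outcomes after thresholding scores (threshold is an assumption)."""
    c = Confusion(0, 0, 0, 0)
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            c.tp += 1
        elif pred == 1 and y == 0:
            c.fp += 1
        elif pred == 0 and y == 0:
            c.tn += 1
        else:
            c.fn += 1
    return c


def summary_metrics(c):
    """PPV, NPV, accuracy and F1; assumes every denominator is non-zero."""
    ppv = c.tp / (c.tp + c.fp)      # precision
    npv = c.tn / (c.tn + c.fn)
    recall = c.tp / (c.tp + c.fn)   # sensitivity
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * ppv * recall / (ppv + recall)
    return {"ppv": ppv, "npv": npv, "accuracy": accuracy, "f1": f1}


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative (Mann-Whitney formulation); ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event (e.g. death), 0 = no event.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]

counts = confusion_counts(y_true, scores)
results = summary_metrics(counts)
results["auc"] = auc(y_true, scores)
print(results)
```

Note that PPV and NPV depend on the event rate in the cohort, which is one reason these values can differ between a discovery and a validation cohort even when AUC is similar.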
About the journal:
The Official Journal of the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC)
Clinica Chimica Acta is a high-quality journal which publishes original Research Communications in the field of clinical chemistry and laboratory medicine, defined as the diagnostic application of chemistry, biochemistry, immunochemistry, biochemical aspects of hematology, toxicology, and molecular biology to the study of human disease in body fluids and cells.
The objective of the journal is to publish novel information leading to a better understanding of biological mechanisms of human diseases, their prevention, diagnosis, and patient management. Reports of an applied clinical character are also welcome. Papers concerned with normal metabolic processes or with constituents of normal cells or body fluids, such as reports of experimental or clinical studies in animals, are only considered when they are clearly and directly relevant to human disease. Evaluations of commercial products have a low priority for publication, unless they are novel or represent a technological breakthrough. Studies dealing with effects of drugs and natural products and studies dealing with the redox status in various diseases are not within the journal's scope. Development and evaluation of novel analytical methodologies where applicable to diagnostic clinical chemistry and laboratory medicine, including point-of-care testing, and topics on laboratory management and informatics will also be considered. Studies focused on emerging diagnostic technologies and (big) data analysis procedures, including digitalization, mobile health, and artificial intelligence applied to laboratory medicine, are also of interest.