Development of a machine learning-based prognostic model for survival prediction in patients with lung cancer brain metastases using multicenter clinical data
IF 4.1 | Zone 2 (Medicine) | Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS
Yuyan Xie, Xuqin Xiang, Menglin Fan, Hongyan Li, Lijuan Du, Weitong Gao, Tong Chen, Zhihao Shi, Xinqi Yu, Fang Liu
International Journal of Medical Informatics, Volume 203, Article 106025. Published 2025-06-26. DOI: 10.1016/j.ijmedinf.2025.106025. URL: https://www.sciencedirect.com/science/article/pii/S1386505625002424
Citation count: 0
Abstract
Methods
Accurate prognosis prediction for lung cancer brain metastasis (LCBM) patients is critical for clinical decision-making. This study integrates data from the SEER database (n = 2624) and Harbin Medical University Cancer Hospital (n = 362) to develop a machine learning-based prognostic prediction tool. Prognostic factors were selected through Cox regression analysis, and eight prediction models, including XGBoost, Random Forest, and Logistic Regression, were constructed. Performance was evaluated using AUC, learning curves, and PR curves, while the impact of lymph node metastasis was explored through propensity score matching and Kaplan-Meier survival analysis.
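The Kaplan-Meier analysis mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration, and the estimator is the standard product-limit form, S(t) = product over event times t_i <= t of (1 - d_i / n_i), where d_i is the number of deaths at t_i and n_i the number still at risk.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up duration for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # everyone who leaves the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort (months of follow-up), not data from the study.
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

In practice, curves for two matched groups (for example, with and without lymph node metastasis) would be estimated separately and compared with a log-rank test; that comparison is not shown here.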
Results
Risk factors identified included age ≥60 years, T3 stage, and multiple organ metastases, while protective factors included female gender and household income ≥$100,000. The XGBoost model demonstrated superior performance, with mean AUCs of 0.957 (Model 1) and 0.550 (Model 2). The XGBoost-Surv model showed stable performance in both the training set (C-index = 0.653, AUC = 0.731) and the test set (C-index = 0.634, AUC = 0.705). Lymph node metastasis significantly affected prognosis (p < 0.001), though differences in metastatic stages were not statistically significant (p = 0.935).
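For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index, assuming right-censored data and a model that outputs a higher score for higher risk. The function name `concordance_index` and the toy values are illustrative assumptions, not the study's implementation; pairs with tied follow-up times are simply skipped here, which is a simplification.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's last follow-up. The pair is concordant
    when the model assigned patient i the higher risk score.
    Tied scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risk)
print(f"C-index = {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported test-set C-index of 0.634.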
Conclusion
The XGBoost model developed from multicenter data effectively predicts survival outcomes in LCBM patients, with lymph node metastasis serving as an independent prognostic indicator. This model provides a reliable tool for personalized treatment decision-making.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.