Predicting Axillary Lymph Node Metastasis in Breast Cancer Using Ultrasound and Machine Learning with SHAP
Authors: Gengyan Bai, Xiaohong Zhong, Youping Wu, Weijie Lin, Shoulan Zhou, Ping Zhou
Journal: Cancer Management and Research, vol. 17, pp. 2183-2197 (published 2025-09-26)
DOI: https://doi.org/10.2147/CMAR.S542680
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12482940/pdf/
Citation count: 0
Abstract
Background: Accurate preoperative prediction of axillary lymph node (ALN) metastasis in breast cancer is crucial for surgical planning and reducing morbidity. Conventional ultrasound and Doppler methods are limited by subjectivity, while existing machine learning (ML) models often lack interpretability and multi-center validation.
Aim: To evaluate 11 ML algorithms and develop a validated model integrating ultrasound and Doppler features for ALN metastasis prediction, using SHapley Additive exPlanations (SHAP) for interpretability.
Methods: This retrospective dual-center study included 303 patients from Xiamen (internal cohorts: 212 training, 91 validation) and 102 from Longyan (external validation). Features were extracted from preoperative ultrasound and Doppler images. Recursive feature elimination (RFE) and SHAP were used to select key predictors. Gradient Boosting was identified as the optimal algorithm and was compared with B-mode-only and Doppler-only submodels and with clinicopathological scores (Logical, Tumor, Tenon). Performance was assessed via AUC, calibration, and decision curve analysis (DCA). A web calculator was also developed.
Results: Five features were selected: tumor diameter, cortex-to-hilum ratio, lymph node systolic/diastolic ratio, peak systolic velocity, and end-diastolic velocity. The combined model achieved AUCs of 0.981 (training), 0.975 (internal validation), and 0.987 (external validation), outperforming the clinicopathological scores (AUCs 0.517-0.700). It also showed superior calibration (Brier scores 0.045-0.061) and greater net benefit in DCA.
Conclusion: The Gradient Boosting model with SHAP provides accurate, interpretable ALN metastasis prediction, supporting noninvasive risk stratification and personalized breast cancer management.
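The abstract reports three kinds of performance measures (AUC, Brier score for calibration, and net benefit from DCA) without defining them. The following is a minimal sketch of the standard textbook definitions in plain Python. It is not the authors' code; the function names and the toy labels and probabilities are made up for illustration and are not study data.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC: chance that a random positive case scores above a random negative one."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties count as half a win (Mann-Whitney U formulation).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """DCA net benefit at one threshold: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = metastasis, 0 = no metastasis.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1, 0.7, 0.4]

print("AUC:", auc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("Net benefit at pt=0.5:", net_benefit(y_true, y_prob, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies. Only a single threshold is shown here to keep the sketch short.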
About the journal:
Cancer Management and Research is an international, peer-reviewed, open-access journal focusing on cancer research and the optimal use of preventative and integrated treatment interventions to achieve improved outcomes, enhanced survival, and quality of life for cancer patients. Specific topics covered in the journal include:
◦ Epidemiology, detection and screening
◦ Cellular research and biomarkers
◦ Identification of biotargets and agents with novel mechanisms of action
◦ Optimal clinical use of existing anticancer agents, including combination therapies
◦ Radiation and surgery
◦ Palliative care
◦ Patient adherence, quality of life, satisfaction
The journal welcomes submitted papers covering original research, basic science, clinical and epidemiological studies, reviews and evaluations, guidelines, expert opinion and commentary, and case series that offer novel insights into a disease or disease subtype.