Interpretable Machine Learning Models for Predicting Lateral Pelvic Lymph Node Metastasis in Rectal Cancer: A Chinese Multicenter Retrospective Study
Authors: Tixian Xiao, Wei Zhao, Zhen Sun, Fangze Wei, Fuqiang Zhao, Fei Huang, Zeyu Wu, Junge Bai, Xin Wang, Qian Liu
Journal: JCO Precision Oncology, volume 9, e2500192 (Journal Article; published 2025-09-01, Epub 2025-09-17)
DOI: https://doi.org/10.1200/PO-25-00192
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12445183/pdf/
Abstract
Purpose: Internal iliac and obturator lymph nodes are common sites of metastasis in rectal cancer. This study developed a machine learning (ML) model that uses clinical data to predict metastasis to these nodes and applied the SHapley Additive exPlanations (SHAP) method to interpret its predictions.
Materials and methods: Data were collected retrospectively from patients with rectal cancer at four Chinese centers who underwent total mesorectal excision and lateral pelvic lymph node dissection without neoadjuvant therapy. Two centers provided the training and test sets (3:1 ratio), and the other two supplied the external validation set. Lymph node enlargement was determined by imaging and confirmed by pathology. Five ML models were evaluated by area under the receiver operating characteristic curve (AUC), accuracy, and F1 score. Key features included demographics, tumor stage, tumor-to-anal verge distance, imaging measurements, tumor histological differentiation, preoperative carcinoembryonic antigen, and carbohydrate antigen 19-9. SHAP was used to assess feature importance.
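The three evaluation metrics named above can be sketched in plain Python. This is a minimal illustration, not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example.

```python
from itertools import product


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(y_true, scores, threshold=0.5):
    """Accuracy and F1 after thresholding scores (threshold is an assumption)."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy data invented for illustration: 1 = metastasis confirmed by pathology.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

toy_auc = auc(y_true, scores)
toy_accuracy, toy_f1 = accuracy_and_f1(y_true, scores)
print(toy_auc, toy_accuracy, toy_f1)
```

AUC is threshold-free, while accuracy and F1 depend on the chosen cutoff, which is one reason a model can rank differently across the three metrics.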
Results: The training/test sets comprised 411 cases (174 positive) and the external validation set comprised 109 cases (43 positive). The random forest (RF) model ranked second in AUC and accuracy in the training set (0.999 and 0.995) but achieved the highest AUC and accuracy in the test set (0.877 and 0.788). In external validation, the RF model outperformed all other ML models (AUC of 0.899, accuracy of 0.827). Overall, the RF model showed the best performance. According to the SHAP analysis, the most important predictors of internal iliac and obturator lymph node metastasis were, in descending order, the short-axis diameter of enlarged lymph nodes, regional lymph node metastasis, and tumor-to-anal verge distance. At the individual patient level, SHAP force plots explained the RF model's predictions of internal iliac and obturator lymph node metastasis.
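The idea behind per-patient SHAP attributions can be sketched with exact Shapley values on a toy scoring function. Everything in this block is an assumption made for illustration: `toy_model`, its coefficients and signs, the patient values, and the baseline are invented and are not taken from the study. It also enumerates feature coalitions against a single baseline rather than using the tree-specific algorithms of the SHAP library.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance against one baseline.
    Features outside a coalition are set to their baseline value."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(v):
    """Hypothetical risk score; coefficients are invented, not fitted."""
    short_axis_mm, regional_positive, verge_distance_cm = v
    return (0.08 * short_axis_mm + 0.3 * regional_positive
            - 0.02 * verge_distance_cm
            + 0.05 * short_axis_mm * regional_positive)


patient = [10.0, 1.0, 4.0]   # hypothetical patient
baseline = [5.0, 0.0, 8.0]   # hypothetical reference values

phis = shapley_values(toy_model, patient, baseline)
print(phis)
```

The attributions sum to the difference between the patient's score and the baseline score, which is the additivity property that a force plot visualizes for a single prediction.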
Conclusion: An interpretable ML model was developed that accurately predicts internal iliac and obturator lymph node metastasis using clinical data. SHAP analysis enhances understanding of feature contributions, supporting personalized treatment planning.