Dingyi Zhang, Mengyi Shen, Li Zhang, Xin He, Xiaohua Huang
{"title":"建立一个可解释的基于MRI放射学的机器学习模型,能够预测浸润性乳腺癌腋窝淋巴结转移。","authors":"Dingyi Zhang, Mengyi Shen, Li Zhang, Xin He, Xiaohua Huang","doi":"10.1038/s41598-025-10818-0","DOIUrl":null,"url":null,"abstract":"<p><p>This study sought to develop a radiomics model capable of predicting axillary lymph node metastasis (ALNM) in patients with invasive breast cancer (IBC) based on dual-sequence magnetic resonance imaging(MRI) of diffusion-weighted imaging (DWI) and dynamic contrast enhancement (DCE) data. The interpretability of the resultant model was probed with the SHAP (Shapley Additive Explanations) method. Established inclusion/exclusion criteria were used to retrospectively compile MRI and matching clinical data from 183 patients with pathologically confirmed IBC from our hospital evaluated between June 2021 and December 2023. All of these patients had undergone plain and enhanced MRI scans prior to treatment. These patients were separated according to their pathological biopsy results into those with ALNM (n = 107) and those without ALNM (n = 76). These patients were then randomized into training (n = 128) and testing (n = 55) cohorts at a 7:3 ratio. Optimal radiomics features were selected from the extracted data. The random forest method was used to establish three predictive models (DWI, DCE, and combined DWI + DCE sequence models). Area under the curve (AUC) values for receiver operating characteristic (ROC) curves were utilized to assess model performance. The DeLong test was utilized to compare model predictive efficacy. Model discrimination was assessed based on the integrated discrimination improvement (IDI) method. Decision curves revealed net clinical benefits for each of these models. The SHAP method was used to achieve the best model interpretability. Clinicopathological characteristics (age, menopausal status, molecular subtypes, and estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and Ki-67 status) were comparable when comparing the ALNM and non-ALNM groups as well as the training and testing cohorts (P > 0.05). AUC values for the DWI, DCE, and combined models in the training cohort were 0.793, 0.774, and 0.864, respectively, with corresponding values of 0.728, 0.760, and 0.859 in the testing cohort. The predictive efficacy of the DWI and combined models was found to differ significantly according to the DeLong test, as did the predictive efficacy of the DCE and combined models in the training groups (P < 0.05), while no other significant differences were noted in model performance (P > 0.05). IDI results indicated that the combined model offered predictive power levels that were 13.5% (P < 0.05) and 10.2% (P < 0.05) higher than those for the respective DWI and DCE models. In a decision curve analysis, the combined model offered a net clinical benefit over the DCE model. The combined dual-sequence MRI-based radiomics model constructed herein and the supporting interpretability analyses can aid in the prediction of the ALNM status of IBC patients, helping to guide clinical decision-making in these cases.</p>","PeriodicalId":21811,"journal":{"name":"Scientific Reports","volume":"15 1","pages":"26030"},"PeriodicalIF":3.9000,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12271558/pdf/","citationCount":"0","resultStr":"{\"title\":\"Establishment of an interpretable MRI radiomics-based machine learning model capable of predicting axillary lymph node metastasis in invasive breast cancer.\",\"authors\":\"Dingyi Zhang, Mengyi Shen, Li Zhang, Xin He, Xiaohua Huang\",\"doi\":\"10.1038/s41598-025-10818-0\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>This study sought to develop a radiomics model capable of predicting axillary lymph node metastasis (ALNM) in patients with invasive breast cancer (IBC) based on dual-sequence magnetic resonance imaging(MRI) of diffusion-weighted imaging (DWI) and dynamic contrast enhancement (DCE) data. The interpretability of the resultant model was probed with the SHAP (Shapley Additive Explanations) method. Established inclusion/exclusion criteria were used to retrospectively compile MRI and matching clinical data from 183 patients with pathologically confirmed IBC from our hospital evaluated between June 2021 and December 2023. All of these patients had undergone plain and enhanced MRI scans prior to treatment. These patients were separated according to their pathological biopsy results into those with ALNM (n = 107) and those without ALNM (n = 76). These patients were then randomized into training (n = 128) and testing (n = 55) cohorts at a 7:3 ratio. Optimal radiomics features were selected from the extracted data. The random forest method was used to establish three predictive models (DWI, DCE, and combined DWI + DCE sequence models). Area under the curve (AUC) values for receiver operating characteristic (ROC) curves were utilized to assess model performance. The DeLong test was utilized to compare model predictive efficacy. Model discrimination was assessed based on the integrated discrimination improvement (IDI) method. Decision curves revealed net clinical benefits for each of these models. The SHAP method was used to achieve the best model interpretability. Clinicopathological characteristics (age, menopausal status, molecular subtypes, and estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and Ki-67 status) were comparable when comparing the ALNM and non-ALNM groups as well as the training and testing cohorts (P > 0.05). AUC values for the DWI, DCE, and combined models in the training cohort were 0.793, 0.774, and 0.864, respectively, with corresponding values of 0.728, 0.760, and 0.859 in the testing cohort. The predictive efficacy of the DWI and combined models was found to differ significantly according to the DeLong test, as did the predictive efficacy of the DCE and combined models in the training groups (P < 0.05), while no other significant differences were noted in model performance (P > 0.05). IDI results indicated that the combined model offered predictive power levels that were 13.5% (P < 0.05) and 10.2% (P < 0.05) higher than those for the respective DWI and DCE models. In a decision curve analysis, the combined model offered a net clinical benefit over the DCE model. The combined dual-sequence MRI-based radiomics model constructed herein and the supporting interpretability analyses can aid in the prediction of the ALNM status of IBC patients, helping to guide clinical decision-making in these cases.</p>\",\"PeriodicalId\":21811,\"journal\":{\"name\":\"Scientific Reports\",\"volume\":\"15 1\",\"pages\":\"26030\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2025-07-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12271558/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific Reports\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.1038/s41598-025-10818-0\",\"RegionNum\":2,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific Reports","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41598-025-10818-0","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Establishment of an interpretable MRI radiomics-based machine learning model capable of predicting axillary lymph node metastasis in invasive breast cancer.
This study sought to develop a radiomics model capable of predicting axillary lymph node metastasis (ALNM) in patients with invasive breast cancer (IBC) based on dual-sequence magnetic resonance imaging(MRI) of diffusion-weighted imaging (DWI) and dynamic contrast enhancement (DCE) data. The interpretability of the resultant model was probed with the SHAP (Shapley Additive Explanations) method. Established inclusion/exclusion criteria were used to retrospectively compile MRI and matching clinical data from 183 patients with pathologically confirmed IBC from our hospital evaluated between June 2021 and December 2023. All of these patients had undergone plain and enhanced MRI scans prior to treatment. These patients were separated according to their pathological biopsy results into those with ALNM (n = 107) and those without ALNM (n = 76). These patients were then randomized into training (n = 128) and testing (n = 55) cohorts at a 7:3 ratio. Optimal radiomics features were selected from the extracted data. The random forest method was used to establish three predictive models (DWI, DCE, and combined DWI + DCE sequence models). Area under the curve (AUC) values for receiver operating characteristic (ROC) curves were utilized to assess model performance. The DeLong test was utilized to compare model predictive efficacy. Model discrimination was assessed based on the integrated discrimination improvement (IDI) method. Decision curves revealed net clinical benefits for each of these models. The SHAP method was used to achieve the best model interpretability. Clinicopathological characteristics (age, menopausal status, molecular subtypes, and estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and Ki-67 status) were comparable when comparing the ALNM and non-ALNM groups as well as the training and testing cohorts (P > 0.05). AUC values for the DWI, DCE, and combined models in the training cohort were 0.793, 0.774, and 0.864, respectively, with corresponding values of 0.728, 0.760, and 0.859 in the testing cohort. The predictive efficacy of the DWI and combined models was found to differ significantly according to the DeLong test, as did the predictive efficacy of the DCE and combined models in the training groups (P < 0.05), while no other significant differences were noted in model performance (P > 0.05). IDI results indicated that the combined model offered predictive power levels that were 13.5% (P < 0.05) and 10.2% (P < 0.05) higher than those for the respective DWI and DCE models. In a decision curve analysis, the combined model offered a net clinical benefit over the DCE model. The combined dual-sequence MRI-based radiomics model constructed herein and the supporting interpretability analyses can aid in the prediction of the ALNM status of IBC patients, helping to guide clinical decision-making in these cases.
期刊介绍:
We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections.
Scientific Reports has a 2-year impact factor: 4.380 (2021), and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).
•Engineering
Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world''s biggest challenges, helping to save lives and improve the way we live.
•Physical sciences
Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature — often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.
•Earth and environmental sciences
Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.
•Biological sciences
Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms from microorganisms, animals to plants.
•Health sciences
The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.