Contrast-enhanced mammography-based interpretable machine learning model for the prediction of the molecular subtype breast cancers.
Mengwei Ma, Weimin Xu, Jun Yang, Bowen Zheng, Chanjuan Wen, Sina Wang, Zeyuan Xu, Genggeng Qin, Weiguo Chen
BMC Medical Imaging, vol. 25, no. 1, article 255. Published 2025-07-01. DOI: 10.1186/s12880-025-01765-3 (https://doi.org/10.1186/s12880-025-01765-3)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12220444/pdf/
Citation count: 0
Abstract
Objective: This study aims to establish a machine learning prediction model to explore the correlation between contrast-enhanced mammography (CEM) imaging features and molecular subtypes of mass-type breast cancer.
Materials and methods: This retrospective study included women with breast cancer who underwent CEM preoperatively between 2018 and 2021. We included 241 patients, who were randomly assigned to either a training or a test set in a 7:3 ratio. Twenty-one features were visually described: four clinical features and seventeen radiological features extracted from the CEM images. Three binary classifications of subtypes were performed: Luminal vs. non-Luminal, HER2-enriched vs. non-HER2-enriched, and triple-negative (TNBC) vs. non-triple-negative. A multinomial naive Bayes (MNB) machine learning scheme was employed for the classification, and the least absolute shrinkage and selection operator (LASSO) method was used to select the most predictive features for the classifiers. The classification performance was evaluated using the area under the receiver operating characteristic curve (AUC). We also used SHapley Additive exPlanations (SHAP) values to explain the prediction model.
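The workflow described above (7:3 split, multinomial naive Bayes, AUC evaluation) can be sketched in a few lines of standard-library Python. This is an illustration only, not the authors' code: the data are synthetic, the function names `train_mnb`, `log_odds`, and `auc` are hypothetical, and the LASSO feature-selection step is omitted for brevity.

```python
import math
import random

def train_mnb(X, y, alpha=1.0):
    """Fit a two-class multinomial naive Bayes model with Laplace smoothing."""
    n_features = len(X[0])
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        counts = [sum(col) for col in zip(*rows)]  # per-feature counts in class c
        total = sum(counts)
        model[c] = {
            "log_prior": math.log(len(rows) / len(X)),
            "log_prob": [math.log((k + alpha) / (total + alpha * n_features))
                         for k in counts],
        }
    return model

def log_odds(model, x):
    """Log-odds of class 1 versus class 0 for one feature vector."""
    def joint(c):
        return model[c]["log_prior"] + sum(
            v * lp for v, lp in zip(x, model[c]["log_prob"]))
    return joint(1) - joint(0)

def auc(scores, labels):
    """AUC as the chance that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in for 241 cases with four one-hot features (assumption, not study data).
rng = random.Random(0)  # fixed seed so the sketch is repeatable
labels = [i % 2 for i in range(241)]
# Feature 0 is present more often in class 1; features 1-3 are noise.
cases = [[int(rng.random() < (0.8 if y else 0.3))] +
         [rng.randint(0, 1) for _ in range(3)] for y in labels]

# Random 7:3 split into training and test sets.
order = list(range(241))
rng.shuffle(order)
cut = int(0.7 * 241)  # 168 training cases, 73 test cases
train_idx, test_idx = order[:cut], order[cut:]

model = train_mnb([cases[i] for i in train_idx], [labels[i] for i in train_idx])
test_scores = [log_odds(model, cases[i]) for i in test_idx]
test_auc = auc(test_scores, [labels[i] for i in test_idx])
print(f"train={len(train_idx)} test={len(test_idx)} AUC={test_auc:.3f}")
```

The AUC here is computed as the fraction of positive/negative pairs that the classifier ranks correctly, which is equivalent to the area under the ROC curve. In practice a library such as scikit-learn would typically replace all three helper functions.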
Results: The model that combined low-energy (LE) and dual-energy subtraction (DES) images outperformed models using either image type alone, yielding an area under the receiver operating characteristic curve of 0.798 for Luminal vs. non-Luminal subtypes, 0.695 for TNBC vs. non-TNBC, and 0.773 for HER2-enriched vs. non-HER2-enriched. The SHAP analysis shows that "LE_mass_margin_spiculated," "DES_mass_enhanced_margin_spiculated," and "DES_mass_internal_enhancement_homogeneous" have the greatest impact on the model's predictions for Luminal vs. non-Luminal breast cancer. "mass_calcification_relationship_no," "calcification_type_no," and "LE_mass_margin_spiculated" have a considerable impact on the model's predictions for HER2-enriched vs. non-HER2-enriched breast cancer.
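To make the idea behind the SHAP attributions concrete, the sketch below computes exact Shapley values for a toy model by averaging each feature's marginal contribution over all feature orderings. It is an assumption-laden illustration: the three binary inputs stand in for one-hot indicators like those named above, the weights are invented for the example, and the code does not use the `shap` library or the study's model.

```python
from itertools import permutations

# Hypothetical log-odds weights for three one-hot indicators (made up for illustration).
WEIGHTS = [1.5, -0.5, 0.25]

def predict(features):
    """Toy linear score: weighted sum of the feature values."""
    return sum(w * v for w, v in zip(WEIGHTS, features))

def shapley_values(predict_fn, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)       # start with every feature "absent"
        prev = predict_fn(current)
        for i in order:
            current[i] = x[i]          # switch feature i to its observed value
            new = predict_fn(current)
            phi[i] += new - prev       # marginal contribution of feature i
            prev = new
    return [p / len(orderings) for p in phi]

x = [1, 1, 0]          # case to explain: first two indicators present
baseline = [0, 0, 0]   # reference case: no indicator present
phi = shapley_values(predict, x, baseline)
print(phi)  # [1.5, -0.5, 0.0]
```

The attributions sum to the difference between the score for the explained case and the score for the baseline (here 1.0), which is the additivity property that SHAP relies on. Exact enumeration grows factorially with the number of features, so practical tools approximate it.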
Conclusions: The radiological characteristics of breast tumors extracted from CEM were found to be associated with breast cancer subtypes in our study. Future research is needed to validate these findings.
About the journal:
BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.