Machine learning prediction of HER2-low expression in breast cancers based on hematoxylin-eosin-stained slides
Jun Du, Jun Shi, Dongdong Sun, Yifei Wang, Guanfeng Liu, Jingru Chen, Wei Wang, Wenchao Zhou, Yushan Zheng, Haibo Wu
Breast Cancer Research 27(1):57. Published 2025-04-18. DOI: https://doi.org/10.1186/s13058-025-01998-8
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12008878/pdf/
Abstract
Background: HER2-targeted therapies are recommended for patients with HER2-positive breast cancer, that is, those with HER2 gene amplification or protein overexpression. Recent clinical trials of novel HER2-targeted therapies have also demonstrated promising efficacy in HER2-low breast cancers, raising the prospect of making a HER2-low category (immunohistochemistry (IHC) score of 1+ or 2+ with non-amplified in-situ hybridization) eligible for HER2-targeted treatment. This makes accurate detection and evaluation of HER2 expression in tumors necessary. In clinical practice, HER2 protein levels are routinely assessed by IHC, which is time-consuming and costly and is technically challenging for many basic-level hospitals in developing countries. Directly predicting HER2 expression from hematoxylin-eosin (HE)-stained slides should therefore be of significant clinical value, and machine learning may be a powerful technology for achieving this goal.
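The category definitions in the Background can be written as a small lookup rule. The sketch below is illustrative only: the function name `her2_category`, the label `"HER2-zero"` for IHC 0, and the choice to treat a missing in-situ hybridization (ISH) result at IHC 1+ as non-amplified are assumptions made here, not definitions from the paper.

```python
def her2_category(ihc_score, ish_amplified=None):
    """Map an IHC score (0-3) and an optional ISH result to a HER2 category.

    Hypothetical helper for illustration. `ish_amplified` is True, False,
    or None when no ISH test was performed.
    """
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2, or 3")

    # HER2-positive: protein overexpression (IHC 3+) or gene amplification.
    if ihc_score == 3 or ish_amplified is True:
        return "HER2-positive"

    # Assumption: IHC 2+ cannot be categorized without an ISH result.
    if ihc_score == 2 and ish_amplified is None:
        raise ValueError("IHC 2+ needs an ISH result")

    # HER2-low: IHC 1+ or 2+ with non-amplified ISH.
    # Assumption: a missing ISH result at IHC 1+ counts as non-amplified.
    if ihc_score in (1, 2):
        return "HER2-low"

    # Assumption: IHC 0 without amplification is labeled "HER2-zero" here.
    return "HER2-zero"
```

For example, `her2_category(2, ish_amplified=False)` returns `"HER2-low"`, while `her2_category(2, ish_amplified=True)` returns `"HER2-positive"`.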
Methods: In this study, we developed an artificial intelligence (AI) classification model that uses whole-slide images of HE-stained slides to automatically assess HER2 status.
Results: The publicly available TCGA-BRCA dataset and an in-house USTC-BC dataset were used to evaluate our AI model and the state-of-the-art method SlideGraph+ in terms of accuracy (ACC), area under the receiver operating characteristic curve (AUC), and F1 score. Overall, our AI model achieved superior performance in HER2 scoring on both datasets, with AUCs of 0.795 ± 0.028 and 0.688 ± 0.008 on the USTC-BC and TCGA-BRCA datasets, respectively. In addition, we visualized the model's outputs as attention heatmaps, which indicated that the model has strong interpretability.
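For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how ACC, F1, and AUC can be computed for a binary task, and how a "mean ± standard deviation" summary can be formed from repeated runs. It is not the authors' evaluation code. The binary setting, the function names, and the use of the sample standard deviation are all assumptions; the abstract does not state how the ± values were obtained.

```python
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(values):
    """Return (mean, sample standard deviation) over repeated runs."""
    return mean(values), stdev(values)
```

With `y_true = [0, 0, 1, 1]` and `scores = [0.1, 0.4, 0.35, 0.8]`, `roc_auc` gives 0.75. Thresholding the scores at 0.5 yields predictions `[0, 0, 0, 1]`, for which `accuracy` is 0.75 and `f1_score` is 2/3.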
Conclusion: Our AI model can directly predict HER2 expression from HE images with strong interpretability and achieves better ACC, particularly in HER2-low breast cancers. It provides a method for AI-based evaluation of HER2 status and helps make HER2 evaluation economical and efficient. It has the potential to assist pathologists in improving diagnosis and in assessing biomarkers for companion diagnostics.
About the journal:
Breast Cancer Research, an international, peer-reviewed online journal, publishes original research, reviews, editorials, and reports. It features open-access research articles of exceptional interest across all areas of biology and medicine relevant to breast cancer. This includes normal mammary gland biology, with a special emphasis on the genetic, biochemical, and cellular basis of breast cancer. In addition to basic research, the journal covers preclinical, translational, and clinical studies with a biological basis, including Phase I and Phase II trials.