Toward Accurate Deep Learning-Based Prediction of Ki67, ER, PR, and HER2 Status From H&E-Stained Breast Cancer Images.

IF 1.3 | Zone 4, Medicine | Q3 | ANATOMY & MORPHOLOGY
Amir Akbarnejad, Nilanjan Ray, Penny J Barnes, Gilbert Bigras
DOI: 10.1097/PAI.0000000000001258 (https://doi.org/10.1097/PAI.0000000000001258)
Journal: Applied Immunohistochemistry & Molecular Morphology
Publication date: 2025-03-27
Publication type: Journal Article
Citation count: 0

Abstract

Despite improvements in machine learning algorithms applied to digital pathology, only moderate accuracy has so far been achieved in predicting molecular information from histology alone. One obstacle is the lack of large data sets for properly training machine learning models. We therefore built a data set of 185,538 breast cancer (BC) images, including hematoxylin and eosin (H&E) images and associated immunohistochemistry (IHC) images of the proliferative marker Ki67, estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Optimal registration of H&E and IHC pairs was achieved. The Ki67, ER, and PR labels to be predicted were extracted from IHC assays using image analysis. These labels were classified ordinally with incremental thresholds (cumulative logit models with balanced and partial proportional odds). The HER2 label was determined as follows: positive if a tumor IHC 3+ pattern was identified, and negative otherwise. Cases with an equivocal IHC score (2+) were excluded. A vision transformer (ViT)-based pipeline trained on this data set achieved a prediction performance of 90% in terms of area under the receiver operating characteristic curve (AUC-ROC). ViT outperformed the weakly supervised clustering-constrained attention multiple instance learning (CLAM) method, which was developed to automatically identify subregions of high diagnostic value in whole slide images. As a first step toward "explaining" artificial intelligence (AI), we evaluated the ability of both classifiers to localize these subregions by inspecting their respective "attention" heat-maps. Despite the high AUC-ROC results of ViT, the heat-maps do not obviously match subregions of high diagnostic value; this finding might, however, provide direction for future work to improve AI attention within whole slide images. Our proposed data set is publicly available.
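
The labeling rules and the evaluation metric described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the string encoding of HER2 scores, the strict ">" comparison, and the example threshold values are all assumptions made for illustration, since the abstract does not give them. The sketch only shows how cumulative binary targets could be derived from incremental thresholds; it does not fit the cumulative logit models mentioned above.

```python
from typing import List, Optional, Sequence


def her2_label(ihc_score: str) -> Optional[int]:
    """Binary HER2 label following the rule stated in the abstract.

    The score strings ("0", "1+", "2+", "3+") are an assumed encoding.
    """
    if ihc_score == "2+":
        return None  # equivocal cases are excluded from the data set
    return 1 if ihc_score == "3+" else 0


def cumulative_targets(value: float, thresholds: Sequence[float]) -> List[int]:
    """One binary target per incremental threshold: 1 if value exceeds it.

    The thresholds are hypothetical; the abstract does not list them.
    """
    return [1 if value > t else 0 for t in thresholds]


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, with hypothetical thresholds of 10, 20, and 30 (percent of positive cells), a measured value of 25 yields the targets `[1, 1, 0]`, so each threshold becomes its own binary prediction task whose AUC can be reported separately.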

Source journal: Applied Immunohistochemistry & Molecular Morphology
Categories: ANATOMY & MORPHOLOGY; MEDICAL LABORATORY TECHNOLOGY
CiteScore: 3.20
Self-citation rate: 0.00%
Articles published: 153
Journal description: Applied Immunohistochemistry & Molecular Morphology covers newly developed identification and detection technologies, and their applications in research and diagnosis for the applied immunohistochemist and molecular morphologist. Official Journal of the International Society for Immunohistochemistry and Molecular Morphology.