Development of a Deep Learning model Tailored for HER2 Detection in Breast Cancer to aid pathologists in interpreting HER2-Low cases

Pierre-Antoine Bannier, Glenn Broeckx, Loic Herpin, Remy Dubois, Lydwine Van Praet, Charles Maussion, Frederik Deman, Ellen Amonoo, Anca Mera, Jasmine Timbres, Cheryl Gillett, Elinor Sawyer, Patrycja Gazinska, Piotr Ziolkowski, Magali Lacroix-Triki, Roberto Salgado, Sheeba Irshad
bioRxiv - Pathology, published 2024-07-03. DOI: 10.1101/2024.07.01.601397 (https://doi.org/10.1101/2024.07.01.601397)

Abstract

Introduction. Over 50% of breast cancer (BC) cases are human epidermal growth factor receptor 2 (HER2)-low, characterized by HER2 immunohistochemistry (IHC) scores of 1+ or 2+ with no amplification on fluorescence in situ hybridization (FISH) testing. The development of new anti-HER2 antibody-drug conjugates (ADCs) for treating HER2-low breast cancer illustrates the importance of accurately assessing HER2 status, particularly in HER2-low cases. In this study, we evaluated the performance of a deep learning (DL) model for HER2 assessment and examined the causes of HER2-null discordances between a pathologist and the DL model. We specifically focused on aligning the DL model rules with the ASCO/CAP guidelines, including the staining intensity of stained cells and the completeness of membrane staining.

Methods. We trained a DL model on a multi-centric cohort of breast cancer cases with HER2 IHC scores (n=299). The model was validated on 2 independent multi-centric validation cohorts (n=369 and n=92). All cases were reviewed by 3 senior breast pathologists, and the ground truth was determined by majority consensus on the final HER2 score. In total, 760 breast cancer cases were used across the training and validation phases of the study.

Results. The model's concordance with the ground truth (ICC = 0.77 [0.68 - 0.83]; Fisher P = 1.32e-10) was higher than the average agreement among the 3 senior pathologists (ICC = 0.45 [0.17 - 0.65]; Fisher P = 2e-3). In the two validation cohorts, the DL model identified 95% [93% - 98%] and 97% [91% - 100%] of HER2-low and HER2-positive tumors, respectively. Discordant results were characterized by morphological features such as extended fibrosis, a high number of tumor-infiltrating lymphocytes, and necrosis; some artifacts, such as non-specific background staining in the cytoplasm of tumor cells, also caused discrepancies.

Conclusion. Deep learning can support pathologists' interpretation of difficult HER2-low cases. Morphological variables and some specific artifacts can cause discrepant HER2 scores between the pathologist and the DL model.
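
The abstract relies on two simple rules: the HER2-low definition (IHC 1+ or 2+ with no FISH amplification) and a ground truth set by majority consensus among three pathologists. The Python sketch below illustrates both under stated assumptions and is not the authors' implementation. The function names `her2_category` and `majority_consensus` are hypothetical. Treating IHC 3+ and IHC 2+ with FISH amplification as HER2-positive follows the usual ASCO/CAP convention rather than anything stated in the abstract. Consulting FISH only for IHC 2+, and returning `None` when no score has a strict majority, are assumptions, since the abstract does not say how such cases were handled.

```python
from collections import Counter
from typing import Optional, Sequence


def her2_category(ihc_score: str, fish_amplified: Optional[bool] = None) -> str:
    """Map an IHC score (plus an optional FISH result) to a HER2 category.

    Assumption: FISH is only consulted for IHC 2+ cases.
    """
    if ihc_score == "0":
        return "HER2-null"
    if ihc_score == "1+":
        return "HER2-low"
    if ihc_score == "2+":
        if fish_amplified is None:
            # Without a FISH result, a 2+ case cannot be resolved.
            return "equivocal"
        return "HER2-positive" if fish_amplified else "HER2-low"
    if ihc_score == "3+":
        return "HER2-positive"
    raise ValueError(f"unknown IHC score: {ihc_score!r}")


def majority_consensus(scores: Sequence[str]) -> Optional[str]:
    """Return the score given by a strict majority of readers, else None."""
    if not scores:
        return None
    label, count = Counter(scores).most_common(1)[0]
    return label if count > len(scores) / 2 else None


if __name__ == "__main__":
    # Three hypothetical pathologist reads for one case.
    reads = ["1+", "1+", "2+"]
    consensus = majority_consensus(reads)
    print(consensus, her2_category(consensus))
```

With three readers, a strict majority means at least two identical scores. A case where all three scores differ yields no consensus in this sketch and would need a separate adjudication step.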