Inter-rater agreement of HER2-low scores between expert breast pathologists and the Visiopharm digital image analysis application (HER2 APP, CE2797)

IF 3.7; Zone 2 (Medicine); JCR Q1 (Pathology)

Journal of Pathology Clinical Research, volume 11, issue 6. Pub Date: 2025-10-16. DOI: 10.1002/2056-4538.70051

Suzanne Parry, Lila Zabaglo, Abeer M Shaaban, Andrew Dodson


Citations: 0

Abstract

Inter-observer concordance data for the HER2 category, as assessed by a group of 16 specialist breast pathologists on 50 diagnostic core biopsies, were compared with those produced by digital image analysis (DIA) using the HER2 APP, CE2797 (VP APP; Visiopharm, Hoersholm, Denmark). Comparing pathologists' consensus scores with DIA scores, 36 cases (73.5%) agreed. Fleiss' kappa statistic was 0.433 (indicative of moderate agreement). Cohen's weighted kappa was used to compare the scores of individual raters with consensus scores; for all 50 cases the kappa scores ranged from 0.412 to 0.854, and the VP APP was ranked 12th of 17 raters (kappa score 0.638, indicating substantial agreement). Results for HER2-low cases (N = 44) showed a kappa score range of 0.295 to 0.823; the VP APP ranked 12th of 17 (score 0.535, indicating moderate agreement). For high-agreement cases the kappa score range was 0.664 to 1.000 for all HER2 scores (N = 24), and the VP APP scored 0.916 (indicating almost perfect agreement). For the HER2-low scores (N = 20), the kappa score range was 0.506 to 1.000 and the VP APP scored 0.860 (almost perfect agreement). DIA of the proportions of tumour cells showing expression within each of the HER2 categories demonstrated that the majority of cases with a low level of agreement between pathologists showed heterogeneity and/or a level of expression close to a cut-point for decision making. This study demonstrates that the VP APP produces results that are extremely well aligned with those of expert pathologists in cases with good overall agreement, and that in difficult cases its reproducibility will outperform that of the visual scorer. The results also suggest that use of the VP APP has the potential to reduce the proportion of cases referred for gene amplification testing by reducing the number of cases incorrectly classified as HER2 2+.
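For readers unfamiliar with the statistics named in the abstract, the following is a minimal Python sketch of Fleiss' kappa (agreement across a group of raters) and Cohen's weighted kappa (one rater against a reference such as the consensus score). It is an illustration under stated assumptions, not the authors' analysis code: the four-level category list, the linear weighting scheme, and every function name are assumptions, since the abstract does not say which weights or software were used. The descriptive labels follow the widely used Landis and Koch bands, which match the wording quoted in the abstract.

```python
from collections import Counter

# Assumed ordinal HER2 IHC scale; the study's exact category set is not given here.
CATEGORIES = ["0", "1+", "2+", "3+"]


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of one label per rater.

    Every case must be scored by the same number of raters.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(case) for case in ratings]
    # Observed agreement: proportion of agreeing rater pairs within each case.
    p_bar = sum(
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ) / n_cases
    # Chance agreement from the overall share of each category.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_e = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


def weighted_kappa(rater, reference, categories=CATEGORIES):
    """Cohen's kappa with linear disagreement weights (an assumed scheme).

    Undefined (division by zero) if both raters use a single identical category.
    """
    k = len(categories)
    index = {label: i for i, label in enumerate(categories)}
    n = len(rater)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater, reference):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # 0 on the diagonal, 1 for the most distant pair of categories.
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - d_obs / d_exp


def interpret(kappa):
    """Landis and Koch descriptive bands for a kappa value."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Made-up scores for illustration only; these are not data from the study.
    consensus = ["0", "1+", "1+", "2+", "3+", "1+"]
    one_rater = ["0", "1+", "2+", "2+", "3+", "0"]
    kappa = weighted_kappa(one_rater, consensus)
    print(f"weighted kappa = {kappa:.3f} ({interpret(kappa)})")
```

Linear weights penalise a 0 versus 1+ disagreement less than a 0 versus 3+ disagreement, which suits an ordinal scale; quadratic weights are a common alternative and would give different values.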




Source journal

Journal of Pathology Clinical Research (Medicine: Pathology and Forensic Medicine)

CiteScore: 7.40

Self-citation rate: 2.40%

Articles published: not shown

Review time: 20 weeks

About the journal: The Journal of Pathology: Clinical Research and The Journal of Pathology serve as translational bridges between basic biomedical science and clinical medicine, with particular emphasis on, but not restricted to, tissue-based studies. The focus of The Journal of Pathology: Clinical Research is the publication of studies that illuminate the clinical relevance of research in the broad area of the study of disease. Appropriately powered and validated studies with novel diagnostic, prognostic and predictive significance, and biomarker discovery and validation, will be welcomed. Studies with a predominantly mechanistic basis will be more appropriate for the companion Journal of Pathology.