Interrater agreement and variability in visual reading of [18F] flutemetamol PET images.

IF 2.5 | Zone 4 (Medicine) | Q2 Radiology, Nuclear Medicine & Medical Imaging

Annals of Nuclear Medicine Pub Date : 2024-09-24 DOI:10.1007/s12149-024-01977-7

Akinori Takenaka, Takashi Nihashi, Keita Sakurai, Keiji Notomi, Hokuto Ono, Yoshitaka Inui, Shinji Ito, Yutaka Arahata, Akinori Takeda, Kazunari Ishii, Kenji Ishii, Kengo Ito, Hiroshi Toyama, Akinori Nakamura, Takashi Kato

Citations: 0

Abstract

Objective: This study aimed to validate the concordance of visual ratings of [18F] flutemetamol amyloid positron emission tomography (PET) images and to investigate the relationship between rater agreement and the Centiloid (CL) scale.

Methods: A total of 192 participants, clinically classified as cognitively normal (CN) (n = 59), mild cognitive impairment (MCI) (n = 65), Alzheimer's disease (AD) (n = 55), or non-AD dementia (n = 13), were included in this study. Three experts visually rated the amyloid PET images of all 192 participants, assigning a confidence level to each rating on a three-point scale (certain, probable, or neither). Each image was classified as amyloid positive or negative by majority vote. The CL value was calculated using the CapAIBL pipeline.

Results: Overall, 101 images were determined to be positive and 91 negative. Of the 101 positive images, the three raters were in complete agreement for 92 and in disagreement for 9. Of the 91 negative images, the three raters were in complete agreement for 75 and in disagreement for 16. Interrater reliability among the three experts was high, with both Fleiss' kappa and Conger's kappa measuring 0.83 (0.76-0.89). The CL values of the unanimous positive group were significantly greater than those of the other groups, whereas the CL values of the unanimous negative group were significantly lower than those of the other groups. Images with rater disagreement had intermediate CLs. In cases rated with a high confidence level, the positive or negative visual ratings were in almost complete agreement. As confidence levels decreased, the experts' visual ratings became more variable and the number of cases with disagreement increased.

Conclusion: Three experts independently rated 192 amyloid PET images and achieved a high level of interrater agreement. However, visual ratings varied in patients with intermediate amyloid accumulation. Positive or negative determinations in these patients should therefore be made with caution.
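
The agreement counts in the Results are enough to recompute the Fleiss' kappa point estimate. The following is a minimal sketch, not the authors' analysis code. It assumes every rating was strictly binary (positive or negative), so that any disagreement among three raters is a 2-1 split. The names `pattern_counts` and `fleiss_kappa` are illustrative. Conger's kappa and the reported interval cannot be reproduced this way, since they require per-rater data that the abstract does not give.

```python
# Vote patterns as (positive votes, negative votes) -> number of images,
# taken from the Results. Assumption: ratings are binary, so each
# disagreement among three raters is a 2-1 split.
pattern_counts = {
    (3, 0): 92,  # unanimous positive
    (2, 1): 9,   # positive by majority vote, one dissenting rater
    (1, 2): 16,  # negative by majority vote, one dissenting rater
    (0, 3): 75,  # unanimous negative
}


def fleiss_kappa(pattern_counts):
    """Fleiss' kappa from per-image category counts grouped by vote pattern."""
    first_pattern = next(iter(pattern_counts))
    n_raters = sum(first_pattern)
    n_categories = len(first_pattern)
    n_images = sum(pattern_counts.values())

    # Mean observed agreement: per-image agreement averaged over images.
    observed = sum(
        count * (sum(v * v for v in votes) - n_raters)
        / (n_raters * (n_raters - 1))
        for votes, count in pattern_counts.items()
    ) / n_images

    # Chance agreement from the overall share of ratings in each category.
    total_ratings = n_images * n_raters
    shares = [
        sum(votes[j] * count for votes, count in pattern_counts.items())
        / total_ratings
        for j in range(n_categories)
    ]
    expected = sum(p * p for p in shares)

    return (observed - expected) / (1 - expected)


kappa = fleiss_kappa(pattern_counts)
print(round(kappa, 2))  # prints 0.83
```

Under the stated assumption, the result rounds to the 0.83 reported in the abstract. The 25 split images lower observed agreement to about 0.91, against a chance agreement of about 0.50.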

Source journal: Annals of Nuclear Medicine (Medicine: Nuclear Medicine)

CiteScore: 4.90

Self-citation rate: 7.70%

Articles published: 111

Review time: 4-8 weeks

Journal description: Annals of Nuclear Medicine is an official journal of the Japanese Society of Nuclear Medicine. It develops the appropriate application of radioactive substances and stable nuclides in the field of medicine. The journal promotes the exchange of ideas, information, and research in nuclear medicine and covers the medical application of radionuclides and related subjects. It presents original articles, short communications, reviews, and letters to the editor.