Evaluating Observer Reliability and Diagnostic Accuracy of CT-LEFAT Criteria for Post-Treatment Head and Neck Lymphedema: A Prospective Blinded Comparative Analysis of Oncologist Human Inter-Rater Performance
MD Anderson Head and Neck Cancer Symptom Working Group, Natalie A. West, Serageldin Kamel, Zaphanlene Kaffey, Cem Dede, Samuel L. Mulder, Dina M. El-Habashy, Roger Neuberger, Mohamed A Naser, Steven J. Frank, Shitong Mao, Holly McMillan, Brad Smith, David Rosenthal, Stephen Y. Lai, Katherine A. Hutcheson, Amy C Moreno, Clifton David Fuller
medRxiv - Oncology, published 2024-09-18. DOI: 10.1101/2024.09.17.24313809 (https://doi.org/10.1101/2024.09.17.24313809)
Abstract
Background
Radiation-associated lymphedema and fibrosis (LEF) is a significant toxicity following radiation therapy (RT) in patients with head and neck cancer (HNC). Recently, the CT Lymphedema and Fibrosis Assessment Tool (CT-LEFAT) was developed to standardize LEF diagnosis through fat stranding visualized on CT. This study aims to evaluate the inter-observer reliability and diagnostic accuracy of the CT-LEFAT criteria.
Materials and Methods
This study retrospectively evaluated 26 HNC patients treated with RT who received a minimum of two contrast-enhanced CT scans. Five physician raters conducted a qualitative review of the fat stranding observed on CT according to the CT-LEFAT criteria. Fleiss' kappa was used to assess inter- and intra-rater reliability, and Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) analysis was used to evaluate diagnostic accuracy.
Results
Inter-rater reliability across the six CT-LEFAT regions generally indicated slight to fair agreement across all raters (0.04 ≤ kappa ≤ 0.36). Intra-observer agreement was generally fair to moderate (overall kappa=0.44). The ROC AUC varied with the aggregation method used (0.60 ≤ average AUC ≤ 0.70).
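For readers unfamiliar with the agreement statistic reported here, the following is a minimal sketch of how Fleiss' kappa can be computed for a fixed number of raters per case. It is not the authors' analysis code: the function names, the toy ratings, and the use of the Landis and Koch descriptive bands (which appear consistent with the "slight", "fair", and "moderate" labels in the abstract) are all assumptions made for illustration.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of category labels
    (one label per rater). Every case must have the same number of raters."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    categories = sorted({label for row in ratings for label in row})
    counts = [Counter(row) for row in ratings]  # per-case category counts

    # Observed agreement: mean over cases of the proportion of agreeing rater pairs.
    per_case = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement: sum of squared overall category proportions.
    total = n_cases * n_raters
    p_expected = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if p_expected == 1.0:
        return float("nan")  # only one category was ever used; kappa is undefined
    return (p_observed - p_expected) / (1.0 - p_expected)


def landis_koch(kappa):
    """Descriptive label for a kappa value (Landis and Koch bands, assumed here)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical data: 4 cases, 3 raters, 1 = fat stranding present, 0 = absent.
    toy = [[1, 1, 0], [0, 0, 0], [1, 0, 0], [1, 1, 1]]
    k = fleiss_kappa(toy)
    print(f"kappa = {k:.3f} ({landis_koch(k)})")
```

Under these assumed bands, the reported inter-rater range of 0.04 to 0.36 spans "slight" to "fair", and the overall intra-observer value of 0.44 falls in "moderate".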
Conclusion
In this specific use case, the CT-LEFAT criteria displayed limited performance. This suggests that additional measures, such as further rater training, refinement of imaging methods, or other process changes, may be required before CT-LEFAT achieves clinically ready diagnostic performance for LEF.