Inter-reader reproducibility of a radiographic grading system for usual interstitial pneumonitis validates its use as a surrogate endpoint in clinical trials
Authors: K. Capaccione, Hong Ma, L. Luk, Mary M. Salvatore
Journal: Imaging
Published: 2024-03-06
DOI: https://doi.org/10.1556/1647.2024.00190
Abstract
The primary purpose of this study was to assess the interreader reliability of a grading system for usual interstitial pneumonia (UIP) based on the quantification of normal lung. The system grades each of four lung regions: the right upper and middle lobes, the right lower lobe, the left upper lobe, and the left lower lobe. Each region is assigned a grade as follows: 0: 0% normal lung; 1: 1–49% normal lung; 2: 50–74% normal lung; 3: 75–89% normal lung; 4: 90–99% normal lung; 5: 100% normal lung. The secondary purpose was to compare the grades rendered by radiologists without subspecialty training in cardiothoracic radiology with the grades established by cardiothoracic radiologists, which were considered the gold standard.

Chest CT images were identified by searching the radiology record system for the terms “usual interstitial pneumonia” and “UIP”. Each case was confirmed by radiologist review; pathology was not assessed because only a small fraction of cases underwent biopsy, given the high risk of complications in patients with fibrotic lung disease. Two cardiothoracic radiologists evaluated each CT and reached a consensus grade. Two other radiologists, neither subspecialty trained in cardiothoracic radiology, independently graded each case. Spearman correlation analysis was used to compare the two readers' grades with each other and each reader's grades with the gold standard.

The interreader correlation was strongly positive and statistically significant (Rs = 0.7062, P < 0.001). Compared with the gold standard, readers 1 and 2 achieved Rs = 0.77559 (P < 0.001) and Rs = 0.69958 (P < 0.001), respectively, both statistically significant, strongly positive correlations.

These results demonstrate strong interreader reproducibility and show that radiologists without subspecialty training in cardiothoracic radiology render grades that correlate strongly with those given by cardiothoracic radiologists. These findings support the use of this grading system for UIP both to monitor clinical progression and as a surrogate endpoint for antifibrotic drug trials.
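
The grade bands and the Spearman comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`uip_grade`, `average_ranks`, `spearman`) are hypothetical, the percentage of normal lung is assumed to be a whole number, and the abstract does not say how the regional grades are combined, so the sketch grades a single region only.

```python
from math import sqrt


def uip_grade(percent_normal: int) -> int:
    """Map a whole-number percentage of normal lung (0-100) to a grade 0-5.

    Bands follow the abstract: 0%, 1-49%, 50-74%, 75-89%, 90-99%, 100%.
    Treating the input as an integer is an assumption of this sketch.
    """
    if not 0 <= percent_normal <= 100:
        raise ValueError("percent_normal must be between 0 and 100")
    if percent_normal == 0:
        return 0
    if percent_normal < 50:
        return 1
    if percent_normal < 75:
        return 2
    if percent_normal < 90:
        return 3
    if percent_normal < 100:
        return 4
    return 5


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up illustrative percentages for two readers; NOT study data.
    reader_1 = [uip_grade(p) for p in (0, 30, 60, 80, 95, 100)]
    reader_2 = [uip_grade(p) for p in (10, 45, 55, 92, 85, 100)]
    print(reader_1, reader_2, round(spearman(reader_1, reader_2), 3))
```

Because grades are ordinal and heavily tied, ranks are averaged across ties before correlating; this is the usual convention for Spearman correlation and is assumed here rather than stated in the abstract.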