Development and assessment of the AE-RADS standardized grid for specifically evaluating adverse events in diagnostic radiology and teleradiology.
Jean-François Bergerot, Amandine Crombé, Mylène Seux, Basile Porta, Vanessa Fyon, Samuel Le Nivet, Nicolas Lippa, Rémi Peyre, Paul Etchart, Frédérique Gay, Guillaume Gorincour
BMC Medical Imaging, vol. 25, no. 1, p. 143. Published 2025-05-01. DOI: https://doi.org/10.1186/s12880-025-01670-9
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12046918/pdf/
Abstract
Background: A specific grid for analyzing and grading adverse events in diagnostic radiology is lacking. In France, the standard HAS grid, a generic 5-point scale adapted from the Common Terminology Criteria for Adverse Events (CTCAEs), is criticized for limited applicability in radiology. Our aim was to develop and evaluate a radiology-specific AE grid (AE-RADS) tailored to diagnostic and teleradiological practices and to compare its performance against the CTCAEs-based HAS grid regarding inter-observer reproducibility and agreement with expert consensus.
Methods: AE-RADS, structured as a decision tree with 90 items, was developed by four senior radiologists with extensive AE experience. To assess it, 100 AE cases from early 2022 were reviewed by two radiologists and two non-physician support members, all blinded to the initial AE grading. Observers rated each AE with both the HAS and AE-RADS grids; severity, AE frequency per patient, sources, and types were then compared for inter-observer reproducibility and agreement with expert consensus. Statistical analyses included the intra-class correlation coefficient (ICC), Fleiss' kappa, and Krippendorff's alpha for reproducibility, and the McNemar test for comparing agreement with consensus.
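As an illustration of one of the reproducibility statistics named above, the following is a minimal sketch of Fleiss' kappa for a fixed number of raters assigning categorical grades. It is not the authors' analysis code: the function name `fleiss_kappa`, the input layout (one list of rater labels per case), and the toy grades are all assumptions made for this example.

```python
import math
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical ratings.

    ratings: list of cases; each case is a list holding one label per rater.
    Every case must be rated by the same number of raters (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    counts = [Counter(row) for row in ratings]
    categories = {label for row in ratings for label in row}

    # Observed agreement: proportion of agreeing rater pairs, averaged over cases.
    per_case = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement: sum of squared overall category proportions.
    total = n_cases * n_raters
    p_expected = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if math.isclose(p_expected, 1.0):
        return math.nan  # only one category was ever used; kappa is undefined
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical grades from four observers on three cases (illustrative only).
toy_grades = [
    [1, 1, 1, 1],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]
print(round(fleiss_kappa(toy_grades), 3))
```

Fleiss' kappa treats grades as unordered categories, which is one reason ordinal severity scales are often also summarized with Krippendorff's alpha, as in the study.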
Results: Among 100 patients (49 women, median age 66.9 years), 104 AEs were identified. AE-RADS achieved higher inter-observer reproducibility for AE frequency (ICC = 0.690 vs. 0.642 with HAS) and for grading the most serious AE (Krippendorff alpha = 0.519 vs. 0.506 with HAS). Agreement with expert consensus was significantly greater with AE-RADS (63-81%) than with HAS (25-47%; P-value range: 0.0001-0.0051).
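The comparison of agreement with consensus is a paired design: each case is graded with both grids by the same observer, so only the cases where exactly one grid matches the consensus carry information. The sketch below shows an exact (binomial) McNemar test on such discordant pairs. The abstract does not state whether the exact or chi-square variant was used, so the exact form is an assumption here, and the function names and toy data are hypothetical.

```python
from math import comb


def discordant_counts(first_correct, second_correct):
    """Count cases where only the first, or only the second, grid matched consensus."""
    if len(first_correct) != len(second_correct):
        raise ValueError("paired lists must have the same length")
    only_first = sum(a and not b for a, b in zip(first_correct, second_correct))
    only_second = sum(b and not a for a, b in zip(first_correct, second_correct))
    return only_first, only_second


def mcnemar_exact(only_first, only_second):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = only_first + only_second
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(only_first, only_second)
    # Under the null, discordant pairs split 50/50: Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical paired outcomes for one observer (True = matched consensus).
with_grid_a = [True, True, True, False, True, True, False, True]
with_grid_b = [False, True, False, False, False, True, False, False]
b, c = discordant_counts(with_grid_a, with_grid_b)
print(b, c, mcnemar_exact(b, c))
```

Cases where both grids agree or both disagree with consensus drop out of the test, which is why the p-value depends only on the two discordant counts.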
Conclusion: AE-RADS shows improved, though still imperfect, agreement between evaluators and experts, supporting its potential for more precise AE assessment in diagnostic imaging.
About the journal:
BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.