Samuel G Armato, Karen Drukker, Lubomir Hadjiiski, Carol C Wu, Jayashree Kalpathy-Cramer, George Shih, Maryellen L Giger, Natalie Baughan, Benjamin Bearce, Adam E Flanders, Robyn L Ball, Kyle J Myers, Heather M Whitney, The MIDRC Grand Challenge Working Group
MIDRC mRALE Mastermind Grand Challenge: AI to predict COVID severity on chest radiographs
Journal of Medical Imaging, vol. 12, no. 2, 024505 (publication date 2025-03-01; Epub 2025-04-18). DOI: https://doi.org/10.1117/1.JMI.12.2.024505
Abstract
Purpose: The Medical Imaging and Data Resource Center (MIDRC) mRALE Mastermind Grand Challenge fostered the development of artificial intelligence (AI) techniques for the automated assignment of mRALE (modified radiographic assessment of lung edema) scores to portable chest radiographs from patients known to have COVID-19.
Approach: The challenge utilized 2079 training cases obtained from the publicly available MIDRC data commons, with validation and test cases sampled from not-yet-public MIDRC cases that were inaccessible to challenge participants. The reference standard mRALE scores for the challenge cases were established by a pool of 22 radiologist annotators. Using the MedICI challenge platform, participants submitted their trained algorithms encapsulated in Docker containers. Algorithms were evaluated by the challenge organizers on 814 test cases through two performance assessment metrics: quadratic-weighted kappa and prediction probability concordance.
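For readers unfamiliar with the first metric, the following is a minimal sketch of how a quadratic-weighted kappa can be computed between reference and predicted ordinal scores. It is not the challenge organizers' evaluation code. The function name, the integer-score encoding, and the default of 25 categories (assuming mRALE scores run from 0 to 24) are assumptions made for illustration.

```python
def quadratic_weighted_kappa(reference, predicted, num_categories=25):
    """Quadratic-weighted Cohen's kappa for integer scores in [0, num_categories).

    num_categories=25 assumes scores from 0 to 24 (an assumption, not stated
    in the abstract). Returns 1.0 for perfect agreement.
    """
    n = len(reference)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    k = num_categories

    # Observed agreement matrix: rows are reference scores, columns predictions.
    observed = [[0] * k for _ in range(k)]
    for r, p in zip(reference, predicted):
        observed[r][p] += 1

    # Marginal histograms, used to build the chance-expected matrix.
    ref_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: disagreement cost grows with squared distance.
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = ref_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Every score falls in a single category; treat as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Toy usage with three categories (hypothetical scores, not challenge data).
kappa_same = quadratic_weighted_kappa([0, 1, 2], [0, 1, 2], num_categories=3)
kappa_reversed = quadratic_weighted_kappa([0, 1, 2], [2, 1, 0], num_categories=3)
```

Because the penalty is quadratic in the score difference, a prediction that is off by one category costs far less than one that is off by several, which suits an ordinal severity scale.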
Results: Nine AI algorithms were submitted to the challenge for assessment against the test set cases. The algorithm that demonstrated the highest agreement with the reference standard had a quadratic-weighted kappa of 0.885 and a prediction probability concordance of 0.875. Substantial variability in mRALE scores assigned by the annotators and output by the AI algorithms was observed.
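The abstract does not define prediction probability concordance. As a hedged illustration only, the sketch below assumes a common pairwise definition: among all case pairs whose reference scores differ, count pairs the algorithm orders the same way (concordant), the opposite way (discordant), or ties, and credit ties as half. The function name and the tie handling are assumptions and may differ from the challenge's actual computation.

```python
def prediction_probability(reference, predicted):
    """Pairwise concordance between reference and predicted scores.

    Assumed definition (not confirmed by the abstract):
        (concordant + 0.5 * tied_predictions) / (pairs with distinct reference)
    A value of 1.0 means every such pair is ordered correctly; 0.5 is chance.
    """
    if len(reference) != len(predicted):
        raise ValueError("inputs must have equal length")

    concordant = discordant = tied_predictions = 0
    n = len(reference)
    for a in range(n):
        for b in range(a + 1, n):
            ref_diff = reference[a] - reference[b]
            if ref_diff == 0:
                continue  # pairs tied in the reference carry no ordering
            pred_diff = predicted[a] - predicted[b]
            if pred_diff == 0:
                tied_predictions += 1
            elif (ref_diff > 0) == (pred_diff > 0):
                concordant += 1
            else:
                discordant += 1

    usable_pairs = concordant + discordant + tied_predictions
    if usable_pairs == 0:
        raise ValueError("reference scores are all identical")
    return (concordant + 0.5 * tied_predictions) / usable_pairs


# Toy usage (hypothetical scores, not challenge data).
pk_perfect = prediction_probability([0, 1, 2], [0, 1, 2])
pk_constant = prediction_probability([0, 1, 2], [1, 1, 1])
```

This loop is quadratic in the number of cases, which is adequate for a test set of several hundred cases such as the 814 used here.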
Conclusions: The MIDRC mRALE Mastermind Grand Challenge revealed the potential of AI to assess COVID-19 severity from portable chest radiographs (CXRs), demonstrating promising performance against the reference standard. The observed variability in mRALE scores highlights the challenges in standardizing severity assessment. These findings contribute to ongoing efforts to develop AI technologies for potential use in clinical practice and offer insights for the enhancement of COVID-19 severity assessment.
About the journal:
JMI covers fundamental and translational research, as well as applications, focused on medical imaging, which continue to yield physical and biomedical advancements in the early detection, diagnostics, and therapy of disease as well as in the understanding of normal. The scope of JMI includes: Imaging physics, Tomographic reconstruction algorithms (such as those in CT and MRI), Image processing and deep learning, Computer-aided diagnosis and quantitative image analysis, Visualization and modeling, Picture archiving and communications systems (PACS), Image perception and observer performance, Technology assessment, Ultrasonic imaging, Image-guided procedures, Digital pathology, Biomedical applications of biomedical imaging. JMI allows for the peer-reviewed communication and archiving of scientific developments, translational and clinical applications, reviews, and recommendations for the field.