Quality assurance of late gadolinium enhancement cardiac magnetic resonance images: a deep learning classifier for confidence in the presence or absence of abnormality with potential to prompt real-time image optimization.

Impact factor 4.2 · Tier 1, Medicine · JCR Q1, Cardiac & Cardiovascular Systems
Sameer Zaman, Kavitha Vimalesvaran, Digby Chappell, Marta Varela, Nicholas S Peters, Hunain Shiwani, Kristopher D Knott, Rhodri H Davies, James C Moon, Anil A Bharath, Nick Wf Linton, Darrel P Francis, Graham D Cole, James P Howard
Journal of Cardiovascular Magnetic Resonance, article 101040. Journal Article. Published 2024-06-01 (Epub 2024-03-24).
DOI: https://doi.org/10.1016/j.jocmr.2024.101040
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11129090/pdf/

Abstract

Background: Late gadolinium enhancement (LGE) of the myocardium has significant diagnostic and prognostic implications, and even small areas of enhancement are important. Distinguishing between definitely normal and definitely abnormal LGE images is usually straightforward, but diagnostic uncertainty arises when reporters are not sure whether the observed LGE is genuine. This uncertainty might be resolved by repeating the acquisition (to remove artifact) or by acquiring further intersecting images, but this must take place before the scan finishes. Real-time quality assurance by humans is a complex task requiring training and experience, so being able to identify which images have an intermediate likelihood of LGE while the scan is ongoing, without an expert present, is of high value. Such decision support could prompt immediate image optimization or acquisition of supplementary images to confirm or refute the presence of genuine LGE, which could reduce ambiguity in reports.

Methods: Short-axis, phase-sensitive inversion recovery late gadolinium images were extracted from our clinical cardiac magnetic resonance (CMR) database and shuffled. Two independent, blinded experts scored each individual slice for "LGE likelihood" on a visual analog scale, from 0 (absolute certainty of no LGE) to 100 (absolute certainty of LGE), with 50 representing clinical equipoise. The scored images were split into two classes: either "high certainty" of whether LGE was present or not, or "low certainty." The dataset was split into training, validation, and test sets (70:15:15). A deep learning binary classifier based on the EfficientNetV2 convolutional neural network architecture was trained to distinguish between these categories. Classifier performance on the test set was evaluated by calculating the accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (ROC AUC). Performance was also evaluated on an external test set of images from a different center.
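The two data-preparation steps described in the Methods (turning a 0 to 100 likelihood score into a certainty class, and splitting 70:15:15 so that no patient appears in more than one subset) can be sketched as follows. This is a minimal illustration, not the authors' code. The abstract does not state the score cut-offs or how the two experts' scores were combined, so `low_band` is a caller-supplied, hypothetical parameter and the function takes a single score. The data layout, function names, and toy patient identifiers are also assumptions.

```python
import random
from collections import defaultdict


def certainty_label(score, low_band):
    """Map a 0-100 LGE-likelihood score to a certainty class.

    low_band is a (lower, upper) pair chosen by the caller; the abstract
    does not report the cut-offs used, so no default is provided.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must lie in [0, 100]")
    lower, upper = low_band
    return "low certainty" if lower <= score <= upper else "high certainty"


def split_by_patient(image_patient_ids, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split images into train/val/test, keeping each patient in one subset.

    image_patient_ids: dict mapping image id -> patient id (assumed layout).
    The fractions apply to patients, so image proportions are approximate.
    """
    by_patient = defaultdict(list)
    for image_id, patient_id in image_patient_ids.items():
        by_patient[patient_id].append(image_id)

    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)  # reproducible shuffle

    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    groups = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }
    return {
        name: sorted(img for p in members for img in by_patient[p])
        for name, members in groups.items()
    }


# Toy data: 20 hypothetical patients with 3 slices each.
toy = {f"p{p:02d}_s{s}": f"p{p:02d}" for p in range(20) for s in range(3)}
splits = split_by_patient(toy)

# Illustrative band only; (30, 70) is not taken from the paper.
example_label = certainty_label(50, low_band=(30, 70))
```

Splitting on patient identifiers rather than on individual images avoids slices from the same scan leaking between training and test data. Because this sketch allocates patients rather than images, the image counts per subset will only approximate 70:15:15 when patients contribute different numbers of slices.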

Results: One thousand six hundred and forty-five images (from 272 patients) were labeled and split at the patient level into training (1151 images), validation (247 images), and test (247 images) sets for the deep learning binary classifier. Of these, 1208 images were "high certainty" (255 for LGE, 953 for no LGE), and 437 were "low certainty." An external test set comprising 247 images from 41 patients from another center was also employed. After 100 epochs, the performance on the internal test set was accuracy = 0.94, recall = 0.80, precision = 0.97, F1-score = 0.87, and ROC AUC = 0.94. The classifier also performed robustly on the external test set (accuracy = 0.91, recall = 0.73, precision = 0.93, F1-score = 0.82, and ROC AUC = 0.91). These results were benchmarked against a reference inter-expert accuracy of 0.86.
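For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall, and F1-score follow from the four cells of a binary confusion matrix, and how F1 relates to precision and recall as their harmonic mean. The confusion-matrix counts are invented for illustration and are not the study's data. The abstract does not say which class was treated as positive, and ROC AUC is omitted because it needs per-image classifier scores rather than counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
toy_metrics = binary_metrics(tp=8, fp=2, fn=2, tn=88)

# Recomputing F1 from the rounded precision and recall in the abstract.
internal_f1 = f1_from(0.97, 0.80)  # about 0.877
external_f1 = f1_from(0.93, 0.73)  # about 0.818
```

The recomputed internal value (about 0.877) differs slightly from the reported 0.87, which is plausible because the precision and recall used here are themselves rounded to two decimals. The external value (about 0.818) rounds to the reported 0.82.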

Conclusion: Deep learning shows potential to automate quality control of late gadolinium imaging in CMR. The ability to identify short-axis images with intermediate LGE likelihood in real time may serve as a useful decision-support tool. This approach has the potential to guide immediate further imaging while the patient is still in the scanner, thereby reducing the frequency of recalls and inconclusive reports due to diagnostic indecision.

Source journal
CiteScore: 10.90
Self-citation rate: 12.50%
Articles published: 61
Review time: 6-12 weeks
About the journal: Journal of Cardiovascular Magnetic Resonance (JCMR) publishes high-quality articles on all aspects of basic, translational and clinical research on the design, development, manufacture, and evaluation of cardiovascular magnetic resonance (CMR) methods applied to the cardiovascular system. Topical areas include, but are not limited to: new applications of magnetic resonance to improve the diagnostic strategies, risk stratification, characterization and management of diseases affecting the cardiovascular system; new methods to enhance or accelerate image acquisition and data analysis; results of multicenter, or larger single-center, studies that provide insight into the utility of CMR; and basic biological perceptions derived by CMR methods.