ViViEchoformer: Deep Video Regressor Predicting Ejection Fraction.

Taymaz Akan, Sait Alp, Md Shenuarin Bhuiyan, Tarek Helmy, A Wayne Orr, Md Mostafizur Rahman Bhuiyan, Steven A Conrad, John A Vanchiere, Christopher G Kevil, Mohammad Alfrad Nobel Bhuiyan
{"title":"ViViEchoformer: Deep Video Regressor Predicting Ejection Fraction.","authors":"Taymaz Akan, Sait Alp, Md Shenuarin Bhuiyan, Tarek Helmy, A Wayne Orr, Md Mostafizur Rahman Bhuiyan, Steven A Conrad, John A Vanchiere, Christopher G Kevil, Mohammad Alfrad Nobel Bhuiyan","doi":"10.1007/s10278-024-01336-y","DOIUrl":null,"url":null,"abstract":"<p><p>Heart disease is the leading cause of death worldwide, and cardiac function as measured by ejection fraction (EF) is an important determinant of outcomes, making accurate measurement a critical parameter in PT evaluation. Echocardiograms are commonly used for measuring EF, but human interpretation has limitations in terms of intra- and inter-observer (or reader) variance. Deep learning (DL) has driven a resurgence in machine learning, leading to advancements in medical applications. We introduce the ViViEchoformer DL approach, which uses a video vision transformer to directly regress the left ventricular function (LVEF) from echocardiogram videos. The study used a dataset of 10,030 apical-4-chamber echocardiography videos from patients at Stanford University Hospital. The model accurately captures spatial information and preserves inter-frame relationships by extracting spatiotemporal tokens from video input, allowing for accurate, fully automatic EF predictions that aid human assessment and analysis. The ViViEchoformer's prediction of ejection fraction has a mean absolute error of 6.14%, a root mean squared error of 8.4%, a mean squared log error of 0.04, and an <math> <msup><mrow><mi>R</mi></mrow> <mn>2</mn></msup> </math> of 0.55. ViViEchoformer predicted heart failure with reduced ejection fraction (HFrEF) with an area under the curve of 0.83 and a classification accuracy of 87 using a standard threshold of less than 50% ejection fraction. Our video-based method provides precise left ventricular function quantification, offering a reliable alternative to human evaluation and establishing a fundamental basis for echocardiogram interpretation.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of imaging informatics in medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10278-024-01336-y","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Heart disease is the leading cause of death worldwide, and cardiac function as measured by ejection fraction (EF) is an important determinant of outcomes, making its accurate measurement a critical parameter in patient evaluation. Echocardiograms are commonly used to measure EF, but human interpretation is subject to intra- and inter-observer (reader) variability. Deep learning (DL) has driven a resurgence in machine learning, leading to advances in medical applications. We introduce ViViEchoformer, a DL approach that uses a video vision transformer to directly regress left ventricular ejection fraction (LVEF) from echocardiogram videos. The study used a dataset of 10,030 apical-4-chamber echocardiography videos from patients at Stanford University Hospital. By extracting spatiotemporal tokens from the video input, the model captures spatial information and preserves inter-frame relationships, allowing accurate, fully automatic EF predictions that aid human assessment and analysis. ViViEchoformer predicts ejection fraction with a mean absolute error of 6.14%, a root mean squared error of 8.4%, a mean squared log error of 0.04, and an R² of 0.55. ViViEchoformer predicted heart failure with reduced ejection fraction (HFrEF) with an area under the curve of 0.83 and a classification accuracy of 87% using the standard threshold of less than 50% ejection fraction. Our video-based method provides precise left ventricular function quantification, offering a reliable alternative to human evaluation and establishing a fundamental basis for echocardiogram interpretation.
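The pipeline the abstract describes (spatiotemporal tokenization of an echo clip, transformer encoding, and a scalar regression head that outputs EF) can be illustrated with a minimal sketch. This is not the authors' released implementation; the clip length (32 frames), tubelet size, embedding width, and the names `TubeletEmbedding` and `ViViEchoformerSketch` are illustrative assumptions.

```python
# Minimal sketch of a ViViT-style video regressor for EF prediction (PyTorch).
# NOT the authors' code: layer sizes, tubelet dimensions, and the 32-frame
# clip length are illustrative assumptions.
import torch
import torch.nn as nn


class TubeletEmbedding(nn.Module):
    """Extract spatiotemporal tokens by convolving over (time, height, width)."""

    def __init__(self, embed_dim=512, tubelet=(4, 16, 16), in_ch=1):
        super().__init__()
        self.proj = nn.Conv3d(in_ch, embed_dim, kernel_size=tubelet, stride=tubelet)

    def forward(self, x):                      # x: (B, C, T, H, W)
        x = self.proj(x)                       # (B, D, T', H', W')
        return x.flatten(2).transpose(1, 2)    # (B, num_tokens, D)


class ViViEchoformerSketch(nn.Module):
    """Video transformer that regresses a single scalar (EF, in percent) per clip."""

    def __init__(self, embed_dim=512, depth=8, heads=8, max_tokens=2048):
        super().__init__()
        self.embed = TubeletEmbedding(embed_dim)
        self.cls = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos = nn.Parameter(torch.zeros(1, max_tokens + 1, embed_dim))
        layer = nn.TransformerEncoderLayer(
            embed_dim, heads, dim_feedforward=4 * embed_dim,
            batch_first=True, norm_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(embed_dim, 1)    # scalar EF regression head

    def forward(self, video):                  # video: (B, 1, T, H, W), grayscale echo clip
        tokens = self.embed(video)
        cls = self.cls.expand(tokens.size(0), -1, -1)
        x = torch.cat([cls, tokens], dim=1)
        x = x + self.pos[:, : x.size(1)]
        x = self.encoder(x)
        return self.head(x[:, 0]).squeeze(-1)  # predicted EF per clip


# Example: two 32-frame, 112x112 grayscale clips (EchoNet-Dynamic-style input).
model = ViViEchoformerSketch()
clip = torch.randn(2, 1, 32, 112, 112)
ef_pred = model(clip)                          # shape (2,)
# HFrEF screening with the threshold used in the paper: predicted EF < 50%.
hfref_flag = ef_pred < 50.0
```

Trained with a regression loss against ground-truth EF, such a model can then be evaluated both as a regressor (MAE, RMSE, MSLE, R²) and as an HFrEF classifier by thresholding predictions at 50% EF, which is how the reported AUC and accuracy figures are obtained.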
