Video Transformer for Segmentation of Echocardiography Images in Myocardial Strain Measurement.

Journal of Imaging Informatics in Medicine. Pub Date: 2025-09-17. DOI: 10.1007/s10278-025-01682-5

Kuan-Chih Huang, Chang-En Lin, Donna Shu-Han Lin, Ting-Tse Lin, Cho-Kai Wu, Geng-Shi Jeng, Lian-Yu Lin, Lung-Chun Lin


Citations: 0

Abstract

The adoption of left ventricular global longitudinal strain (LVGLS) is still restricted by variability among various vendors and observers, despite advancements from tissue Doppler to speckle tracking imaging, machine learning, and, more recently, convolutional neural network (CNN)-based segmentation strain analysis. While CNNs have enabled fully automated strain measurement, they are inherently constrained by restricted receptive fields and a lack of temporal consistency. Transformer-based networks have emerged as a powerful alternative in medical imaging, offering enhanced global attention. Among these, the Video Swin Transformer (V-SwinT) architecture, with its 3D-shifted windows and locality inductive bias, is particularly well suited for ultrasound imaging, providing temporal consistency while optimizing computational efficiency. In this study, we propose the DTHR-SegStrain model based on a V-SwinT backbone. This model incorporates contour regression and utilizes an FCN-style multiscale feature fusion. As a result, it can generate accurate and temporally consistent left ventricle (LV) contours, allowing for direct calculation of myocardial strain without the need for conversion from segmentation to contours or any additional postprocessing. Compared to EchoNet-dynamic and Unity-GLS, DTHR-SegStrain showed greater efficiency, reliability, and validity in LVGLS measurements. Furthermore, the hybridization experiments assessed the interaction between segmentation models and strain algorithms, reinforcing that consistent segmentation contours over time can simplify strain calculations and decrease measurement variability. These findings emphasize the potential of V-SwinT-based frameworks to enhance the standardization and clinical applicability of LVGLS assessments.
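The abstract states that temporally consistent LV contours allow strain to be calculated directly, but it does not give the formula. As a minimal sketch, assuming the conventional Lagrangian definition (percentage change in contour length relative to a reference frame, typically end-diastole), the calculation could look like the following. The function names, the toy contours, and the choice of frame 0 as reference are illustrative assumptions, not the DTHR-SegStrain implementation.

```python
import math


def contour_length(points):
    """Length of an open contour given as a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def longitudinal_strain(contours, ref_index=0):
    """Lagrangian strain in percent for every frame.

    contours  : list of per-frame contours, each a list of (x, y) points
    ref_index : frame used as the reference length (assumed end-diastole)
    """
    ref_len = contour_length(contours[ref_index])
    if ref_len == 0:
        raise ValueError("reference contour has zero length")
    return [100.0 * (contour_length(c) - ref_len) / ref_len for c in contours]


def peak_strain(contours, ref_index=0):
    """Most negative strain over the sequence (peak shortening)."""
    return min(longitudinal_strain(contours, ref_index))


# Toy example (hypothetical data): a V-shaped contour that shortens by 20%.
frames = [
    [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)],   # reference frame, length 10
    [(0.0, 0.0), (2.4, 3.2), (4.8, 0.0)],   # shortened frame, length about 8
]
strain_curve = longitudinal_strain(frames)
gls = peak_strain(frames)
```

Because every frame is compared against the same reference length, frame-to-frame jitter in the predicted contour shows up directly as noise in the strain curve, which is one way to read the abstract's point that temporally consistent contours reduce measurement variability.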
