Hanlin Cheng, Zhongqing Shi, Zhanru Qi, Xiaoxian Wang, Guanjun Guo, Aijuan Fang, Zhibin Jin, Chunjie Shan, Ruiyang Chen, Yue Du, Sunnan Qian, Shouhua Luo, Jing Yao
Journal: Biomedical Physics & Engineering Express (Journal Article). Published: 2025-02-19. DOI: 10.1088/2057-1976/adb493 (https://doi.org/10.1088/2057-1976/adb493)
Impact factor: 1.3. JCR: Q3 (Radiology, Nuclear Medicine & Medical Imaging). Open access: no.
Citations: 0
Abstract
Deep learning-based video-level view classification of two-dimensional transthoracic echocardiography.
In recent years, deep learning (DL)-based automatic view classification of 2D transthoracic echocardiography (TTE) has demonstrated strong performance, but has not fully addressed key clinical requirements such as view coverage, classification accuracy, inference delay, and the need for thorough exploration of performance in real-world clinical settings. We proposed a clinical requirement-driven DL framework, TTESlowFast, for accurate and efficient video-level TTE view classification. This framework is based on the SlowFast architecture and incorporates both a sampling balance strategy and a data augmentation strategy to address class imbalance and the limited availability of labeled TTE videos, respectively. TTESlowFast achieved an overall accuracy of 0.9881, precision of 0.9870, recall of 0.9867, and F1 score of 0.9867 on the test set. After field deployment, the model's overall accuracy, precision, recall, and F1 score for view classification were 0.9607, 0.9586, 0.9499, and 0.9530, respectively. The inference time for processing a single TTE video was 105.0 ± 50.1 ms on a desktop GPU (NVIDIA RTX 3060) and 186.0 ± 5.2 ms on an edge computing device (Jetson Orin Nano), which basically meets the clinical demand for immediate processing following image acquisition. The TTESlowFast framework proposed in this study demonstrates effective performance in TTE view classification with low inference delay, making it well-suited for various medical scenarios and showing significant potential for practical application.
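The abstract does not say how the sampling balance strategy works. As an illustration only, the sketch below shows one common way to counter class imbalance: draw training videos with probability inversely proportional to the size of their view class, so that rare views appear about as often as common ones. The function names, the view labels, and the 90/10 split are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per video: 1 / (number of videos in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def balanced_sample(labels, k, seed=0):
    """Draw k video indices with replacement, weighting rare classes up.

    Assumption: sampling with replacement is acceptable for training,
    which is typical when augmentation is applied to each drawn video.
    """
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)


# Toy, made-up dataset: 90 videos of one view and 10 of another.
labels = ["A4C"] * 90 + ["PLAX"] * 10
weights = inverse_frequency_weights(labels)
indices = balanced_sample(labels, k=1000)

# Count how often each view was drawn; the two counts should be roughly equal.
drawn = Counter(labels[i] for i in indices)
print(drawn)
```

With these weights the total sampling mass of every class is the same, so the expected share of each view in a training epoch is equal regardless of how many videos it has. Other balancing schemes, such as per-class quotas or loss reweighting, would fit the abstract's description equally well.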
About the journal:
BPEX is an inclusive, international, multidisciplinary journal devoted to publishing new research on any application of physics and/or engineering in medicine and/or biology. Characterized by a broad geographical coverage and a fast-track peer-review process, relevant topics include all aspects of biophysics, medical physics and biomedical engineering. Papers that are almost entirely clinical or biological in their focus are not suitable. The journal has an emphasis on publishing interdisciplinary work and bringing research fields together, encompassing experimental, theoretical and computational work.