UltrasOM: A Mamba-based network for 3D freehand ultrasound reconstruction using optical flow

IF 4.9 | Zone 2, Medicine | JCR Q1: Computer Science, Interdisciplinary Applications

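Computer Methods and Programs in Biomedicine, vol. 268, Article 108843 | Pub Date: 2025-05-10 | DOI: 10.1016/j.cmpb.2025.108843 | https://www.sciencedirect.com/science/article/pii/S0169260725002603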

Rui Sun, Chuanba Liu, Wenshuo Wang, Yimin Song, Tao Sun


Citations: 0

Abstract


UltrasOM: A mamba-based network for 3D freehand ultrasound reconstruction using optical flow

Background

Three-dimensional (3D) ultrasound (US) reconstruction is valuable in clinical diagnosis because it is safe, portable, low-cost, and capable of real-time imaging. 3D freehand ultrasound reconstruction aims to eliminate the need for tracking devices by relying solely on image data to infer the spatial relationships between frames. However, the jitter inherent in handheld scanning introduces significant inaccuracies, so current methods cannot precisely predict the spatial motion of ultrasound image frames. The resulting errors accumulate over long sequences and produce deformations or artifacts in the reconstructed volume. To address these challenges, we propose UltrasOM, a 3D ultrasound reconstruction network designed for spatial relative motion estimation.

Methods

Initially, we designed a video embedding module that integrates optical flow dynamics with original static information to enhance motion change features between frames. Next, we developed a Mamba-based spatiotemporal attention module, utilizing multi-layer stacked Space-Time Blocks to effectively capture global spatiotemporal correlations within video frame sequences. Finally, we incorporated correlation loss and motion speed loss to prevent overfitting related to scanning speed and pose, enhancing the model's generalization capability.
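The abstract names a correlation loss but does not define it. As an illustration only, the sketch below uses one common formulation from sensorless freehand ultrasound work: one minus the mean Pearson correlation between predicted and ground-truth motion parameters, computed per degree of freedom over a sequence. The function names `pearson` and `correlation_loss` and the list-of-vectors input layout are assumptions made here, not the authors' implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    # A constant sequence has no defined correlation; treat it as 0 here.
    return cov / denom if denom > 0 else 0.0


def correlation_loss(pred, target):
    """1 - mean per-parameter Pearson correlation (hypothetical formulation).

    pred, target: lists of per-frame-pair motion vectors, e.g. 6-DoF
    [tx, ty, tz, rx, ry, rz]; both lists have the same shape.
    """
    dof = len(target[0])
    corrs = [
        pearson([p[d] for p in pred], [t[d] for t in target])
        for d in range(dof)
    ]
    return 1.0 - sum(corrs) / dof
```

Because Pearson correlation ignores scale and offset, a loss of this form is zero for any positively scaled copy of the ground truth, so it would normally be combined with a regression term rather than used alone.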

Results

Experimental results on a dataset of 200 forearm cases, comprising 58,011 frames, demonstrated that the proposed method achieved a final drift rate (FDR) of 10.24%, a frame-to-frame distance error (DE) of 7.34 mm, a symmetric Hausdorff distance error (HD) of 10.81 mm, and a mean angular error (MEA) of 2.05°, outperforming state-of-the-art methods by 13.24%, 15.11%, 3.57%, and 6.32%, respectively.
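The abstract reports these metrics without giving formulas. The sketch below shows how two of them are commonly computed, under assumptions made here: the final drift rate is taken as the position error of the last frame divided by the length of the ground-truth scan path, and the symmetric Hausdorff distance is the larger of the two directed Hausdorff distances between point sets. The function names and the choice of 3D points in millimetres are hypothetical; the paper may define the point sets differently.

```python
import math


def final_drift_rate(pred_positions, true_positions):
    """Final-frame position error divided by ground-truth path length.

    Both arguments are equal-length lists of (x, y, z) positions in mm,
    ordered by frame.
    """
    drift = math.dist(pred_positions[-1], true_positions[-1])
    path_length = sum(
        math.dist(a, b) for a, b in zip(true_positions, true_positions[1:])
    )
    if path_length == 0:
        raise ValueError("ground-truth trajectory has zero length")
    return drift / path_length


def symmetric_hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two finite point sets."""

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))
```

For example, a predicted trajectory that ends 0.5 mm away from the true end point of a 2 mm scan gives `final_drift_rate(...) == 0.25`, that is, a 25% drift.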

Conclusion

By integrating optical flow features and deeply exploring contextual spatiotemporal dependencies, the proposed network can directly predict the relative motions between multiple frames of ultrasound images without the need for tracking, surpassing the accuracy of existing methods.


Source journal

Computer Methods and Programs in Biomedicine (Engineering and Technology: Biomedical Engineering)

CiteScore: 12.30

Self-citation rate: 6.60%

Articles published: 601

Review time: 135 days

Journal description: The journal aims to encourage the development of formal computing methods and their application in biomedical research and medical practice by illustrating fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to support the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; and to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards, and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve biochemists, biologists, geneticists, immunologists, neuroscientists, pharmacologists, toxicologists, clinicians, epidemiologists, psychiatrists, psychologists, cardiologists, chemists, (radio)physicists, computer scientists, programmers and systems analysts, biomedical, clinical, electrical, and other engineers, teachers of medical informatics, and users of educational software.