Laparoscopic stereo matching using 3-Dimensional Fourier transform with full multi-scale features

IF 7.5 · Zone 2 (Computer Science) · Q1 (Automation & Control Systems)
Renkai Wu, Pengchen Liang, Yinghao Liu, Yiqi Huang, Wangyan Li, Qing Chang
{"title":"利用三维傅立叶变换与全多尺度特征进行腹腔镜立体匹配","authors":"Renkai Wu ,&nbsp;Pengchen Liang ,&nbsp;Yinghao Liu ,&nbsp;Yiqi Huang ,&nbsp;Wangyan Li ,&nbsp;Qing Chang","doi":"10.1016/j.engappai.2024.109654","DOIUrl":null,"url":null,"abstract":"<div><div><strong>3</strong>-<strong>D</strong>imensional (3D) reconstruction of laparoscopic surgical scenes is a key task for future surgical navigation and automated robotic minimally invasive surgery. Binocular laparoscopy with stereo matching enables 3D reconstruction. Stereo matching models used for natural images such as autopilot tend to be less suitable for laparoscopic environments due to the constraints of small samples of laparoscopic images, complex textures, and uneven illumination. In addition, current stereo matching modules use 3D convolutions and transformers in the spatial domain as the base module, which is limited by the ability to learn in the spatial domain. In this paper, we propose a model for laparoscopic stereo matching using 3D <strong>F</strong>ourier <strong>T</strong>ransform combined with <strong>F</strong>ull <strong>M</strong>ulti-scale <strong>F</strong>eatures (FT-FMF Net). Specifically, the proposed <strong>F</strong>ull <strong>M</strong>ulti-scale <strong>F</strong>usion <strong>M</strong>odule (FMFM) is able to fuse the full multi-scale feature information from the feature extractor into the stereo matching block, which densely learns the feature information with parallax and FMFM fusion information in the frequency domain using the proposed <strong>D</strong>ense <strong>F</strong>ourier <strong>T</strong>ransform <strong>M</strong>odule (DFTM). We validated the proposed method in both the laparoscopic dataset (SCARED) and the endoscopic dataset (SERV-CT). In comparison with other popular and advanced deep learning models available at present, FT-FMF Net achieves the most advanced stereo matching performance available. In the SCARED and SERV-CT public datasets, the End-Point-Error (EPE) was 0.7265 and 2.3119, and the <strong>R</strong>oot <strong>M</strong>ean <strong>S</strong>quare <strong>E</strong>rror Depth (RMSE Depth) was 4.00 mm and 3.69 mm, respectively. In addition, the inference time is only 0.17s. Our project code is available on <span><span>https://github.com/wurenkai/FT-FMF</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"139 ","pages":"Article 109654"},"PeriodicalIF":7.5000,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Laparoscopic stereo matching using 3-Dimensional Fourier transform with full multi-scale features\",\"authors\":\"Renkai Wu ,&nbsp;Pengchen Liang ,&nbsp;Yinghao Liu ,&nbsp;Yiqi Huang ,&nbsp;Wangyan Li ,&nbsp;Qing Chang\",\"doi\":\"10.1016/j.engappai.2024.109654\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div><strong>3</strong>-<strong>D</strong>imensional (3D) reconstruction of laparoscopic surgical scenes is a key task for future surgical navigation and automated robotic minimally invasive surgery. Binocular laparoscopy with stereo matching enables 3D reconstruction. Stereo matching models used for natural images such as autopilot tend to be less suitable for laparoscopic environments due to the constraints of small samples of laparoscopic images, complex textures, and uneven illumination. 
In addition, current stereo matching modules use 3D convolutions and transformers in the spatial domain as the base module, which is limited by the ability to learn in the spatial domain. In this paper, we propose a model for laparoscopic stereo matching using 3D <strong>F</strong>ourier <strong>T</strong>ransform combined with <strong>F</strong>ull <strong>M</strong>ulti-scale <strong>F</strong>eatures (FT-FMF Net). Specifically, the proposed <strong>F</strong>ull <strong>M</strong>ulti-scale <strong>F</strong>usion <strong>M</strong>odule (FMFM) is able to fuse the full multi-scale feature information from the feature extractor into the stereo matching block, which densely learns the feature information with parallax and FMFM fusion information in the frequency domain using the proposed <strong>D</strong>ense <strong>F</strong>ourier <strong>T</strong>ransform <strong>M</strong>odule (DFTM). We validated the proposed method in both the laparoscopic dataset (SCARED) and the endoscopic dataset (SERV-CT). In comparison with other popular and advanced deep learning models available at present, FT-FMF Net achieves the most advanced stereo matching performance available. In the SCARED and SERV-CT public datasets, the End-Point-Error (EPE) was 0.7265 and 2.3119, and the <strong>R</strong>oot <strong>M</strong>ean <strong>S</strong>quare <strong>E</strong>rror Depth (RMSE Depth) was 4.00 mm and 3.69 mm, respectively. In addition, the inference time is only 0.17s. Our project code is available on <span><span>https://github.com/wurenkai/FT-FMF</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50523,\"journal\":{\"name\":\"Engineering Applications of Artificial Intelligence\",\"volume\":\"139 \",\"pages\":\"Article 109654\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-11-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Engineering Applications of Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0952197624018128\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197624018128","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

3-Dimensional (3D) reconstruction of laparoscopic surgical scenes is a key task for future surgical navigation and automated robotic minimally invasive surgery. Binocular laparoscopy with stereo matching enables such 3D reconstruction. Stereo matching models built for natural images, for example in autonomous driving, tend to be less suitable for laparoscopic environments because of the small number of laparoscopic image samples, complex textures, and uneven illumination. In addition, current stereo matching modules rely on 3D convolutions and transformers operating in the spatial domain as their base modules, which restricts them to what can be learned in the spatial domain. In this paper, we propose a model for laparoscopic stereo matching that uses a 3D Fourier Transform combined with Full Multi-scale Features (FT-FMF Net). Specifically, the proposed Full Multi-scale Fusion Module (FMFM) fuses the full multi-scale feature information from the feature extractor into the stereo matching block, which densely learns the disparity features together with the FMFM fusion information in the frequency domain using the proposed Dense Fourier Transform Module (DFTM). We validated the proposed method on both a laparoscopic dataset (SCARED) and an endoscopic dataset (SERV-CT). Compared with other popular and advanced deep learning models available at present, FT-FMF Net achieves state-of-the-art stereo matching performance. On the SCARED and SERV-CT public datasets, the End-Point-Error (EPE) was 0.7265 and 2.3119, and the Root Mean Square Error of depth (RMSE depth) was 4.00 mm and 3.69 mm, respectively. In addition, the inference time is only 0.17 s. Our project code is available at https://github.com/wurenkai/FT-FMF.
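The core technical claim, learning dense matching features in the frequency domain instead of with stacked spatial 3D convolutions and transformers, can be illustrated with a short sketch. The PyTorch module below is a minimal, hypothetical illustration of a DFTM-style block, not the authors' implementation (see the linked repository for that): it takes a stereo cost volume, applies a 3D FFT over the disparity, height, and width axes, modulates the spectrum with a learnable complex filter, and transforms back. The class name `FrequencyDomainBlock` and the `(B, C, D, H, W)` tensor layout are assumptions made for the example.

```python
import torch
import torch.nn as nn

class FrequencyDomainBlock(nn.Module):
    """Minimal sketch of frequency-domain learning on a stereo cost volume.

    Hypothetical illustration of the DFTM idea: apply one learnable global
    filter in the 3D Fourier domain rather than stacking spatial 3D
    convolutions. Input shape: (B, C, D, H, W), where D is the number of
    disparity hypotheses.
    """

    def __init__(self, channels: int, disp: int, height: int, width: int):
        super().__init__()
        # Learnable complex-valued spectrum filter, stored as (real, imag).
        # rfftn keeps only width // 2 + 1 frequencies along the last axis.
        self.weight = nn.Parameter(
            torch.randn(channels, disp, height, width // 2 + 1, 2) * 0.02
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        d, h, w = x.shape[2:]
        # 3D real FFT over the disparity, height, and width axes.
        spec = torch.fft.rfftn(x, dim=(2, 3, 4), norm="ortho")
        # Element-wise modulation by the learnable complex filter.
        spec = spec * torch.view_as_complex(self.weight)
        # Back to the spatial domain; output shape matches the input.
        return torch.fft.irfftn(spec, s=(d, h, w), dim=(2, 3, 4), norm="ortho")
```

Because an element-wise product in the Fourier domain corresponds to a global circular convolution in the spatial domain, a single block of this kind sees the entire cost volume at O(N log N) cost, which is the usual motivation for frequency-domain alternatives to spatial 3D aggregation. For example, `FrequencyDomainBlock(32, 48, 64, 128)` maps a `(B, 32, 48, 64, 128)` cost volume to a tensor of the same shape.

The reported metrics follow standard definitions: for stereo matching, End-Point-Error is conventionally the mean absolute difference between predicted and ground-truth disparities over valid pixels, and RMSE depth is the root mean square error after converting disparity to depth. A minimal sketch, assuming a rectified pinhole rig with focal length `focal` (in pixels) and baseline `baseline` (in mm):

```python
import torch

def end_point_error(pred_disp: torch.Tensor, gt_disp: torch.Tensor,
                    valid: torch.Tensor) -> torch.Tensor:
    """Mean absolute disparity error (EPE, in pixels) over valid pixels."""
    return (pred_disp - gt_disp).abs()[valid].mean()

def rmse_depth(pred_disp: torch.Tensor, gt_disp: torch.Tensor,
               valid: torch.Tensor, focal: float, baseline: float) -> torch.Tensor:
    """RMSE of depth (same unit as baseline), using depth = focal * baseline / disparity."""
    pred_depth = focal * baseline / pred_disp.clamp(min=1e-6)
    gt_depth = focal * baseline / gt_disp.clamp(min=1e-6)
    return ((pred_depth - gt_depth)[valid] ** 2).mean().sqrt()
```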
Source journal
Engineering Applications of Artificial Intelligence (Engineering: Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles per year: 505
Review time: 68 days
Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.