Desmoking of the Endoscopic Surgery Images Based on a Local-Global U-Shaped Transformer Model

Impact factor: 3.4 | JCR: Q2 (Engineering, Biomedical)
Wanqing Wang;Fucheng Liu;Jianxiong Hao;Xiangyang Yu;Bo Zhang;Chaoyang Shi
Journal: IEEE Transactions on Medical Robotics and Bionics, vol. 7, no. 1, pp. 254-265
Published: 2024-12-13 (Journal Article)
DOI: 10.1109/TMRB.2024.3517139
URL: https://ieeexplore.ieee.org/document/10798614/
Citations: 0

Abstract

In robot-assisted minimally invasive surgery (RMIS), the smoke generated by energy-based surgical instruments blurs and obstructs the endoscopic surgical field, which increases the difficulty and risk of robotic surgery. However, current desmoking research primarily focuses on natural weather conditions, with limited studies addressing desmoking techniques for endoscopic images. Furthermore, surgical smoke presents a notably intricate morphology, and research efforts aimed at uniform, non-uniform, thin, and dense smoke remain relatively limited. This work proposes a Local-Global U-Shaped Transformer Model (LGUformer) based on the U-Net and Transformer architectures to remove complex smoke from endoscopic images. By introducing a local-global multi-head self-attention mechanism and multi-scale depthwise convolution, the proposed model enhances the inference capability. An enhanced feature map fusion method improves the quality of reconstructed images. The improved modules enable efficient handling of variable smoke while generating superior-quality images. Through desmoking experiments on synthetic and real smoke images, the LGUformer model demonstrated superior performance compared with seven other desmoking models in terms of accuracy, clarity, absence of distortion, and robustness. A task-based surgical instrument segmentation experiment indicated the potential of this model as a pre-processing step in visual tasks. Finally, an ablation study was conducted to verify the advantages of the proposed modules.
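The abstract names multi-scale depthwise convolution as one of the model's building blocks but gives no implementation details. The following is a minimal sketch of the general technique only, not the authors' LGUformer code. The kernel sizes (1, 3, 5), the averaging kernels that stand in for learned weights, and the choice to fuse branches by averaging are all assumptions made for illustration.

```python
from typing import List, Sequence

Channel = List[List[float]]   # one feature channel, H x W
FeatureMap = List[Channel]    # C independent channels


def box_kernel(size: int) -> List[List[float]]:
    """Averaging kernel used as a placeholder for learned weights (assumption)."""
    value = 1.0 / (size * size)
    return [[value] * size for _ in range(size)]


def depthwise_conv(channel: Channel, kernel: List[List[float]]) -> Channel:
    """Filter a single channel with a square, odd-sized kernel.

    Zero padding keeps the output the same size as the input.
    """
    height, width = len(channel), len(channel[0])
    k = len(kernel)
    radius = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - radius, x + dx - radius
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += channel[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def multi_scale_depthwise(
    features: FeatureMap, sizes: Sequence[int] = (1, 3, 5)
) -> FeatureMap:
    """Run every kernel size on each channel separately, then average the branches.

    "Depthwise" means no information is mixed across channels.
    """
    fused_map = []
    for channel in features:
        branches = [depthwise_conv(channel, box_kernel(s)) for s in sizes]
        height, width = len(channel), len(channel[0])
        fused = [
            [sum(b[y][x] for b in branches) / len(branches) for x in range(width)]
            for y in range(height)
        ]
        fused_map.append(fused)
    return fused_map


if __name__ == "__main__":
    # Channel 0 has a single bright pixel; channel 1 is empty.
    spike = [[0.0] * 5 for _ in range(5)]
    spike[2][2] = 9.0
    empty = [[0.0] * 5 for _ in range(5)]
    result = multi_scale_depthwise([spike, empty], sizes=(1, 3))
    print(result[0][2])
```

Small kernels preserve fine detail while larger ones aggregate wider context, which is the usual motivation for combining several scales when the degradation (here, smoke of varying density) has no single characteristic size.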