Desmoking of the Endoscopic Surgery Images Based on a Local-Global U-Shaped Transformer Model
Wanqing Wang; Fucheng Liu; Jianxiong Hao; Xiangyang Yu; Bo Zhang; Chaoyang Shi
IEEE Transactions on Medical Robotics and Bionics, vol. 7, no. 1, pp. 254-265. Published 2024-12-13.
DOI: 10.1109/TMRB.2024.3517139
https://ieeexplore.ieee.org/document/10798614/
Abstract
In robot-assisted minimally invasive surgery (RMIS), the smoke generated by energy-based surgical instruments blurs and obstructs the endoscopic surgical field, which increases the difficulty and risk of robotic surgery. However, current desmoking research primarily focuses on natural weather conditions, with limited studies addressing desmoking techniques for endoscopic images. Furthermore, surgical smoke presents a notably intricate morphology, and research efforts aimed at uniform, non-uniform, thin, and dense smoke remain relatively limited. This work proposes a Local-Global U-Shaped Transformer Model (LGUformer) based on the U-Net and Transformer architectures to remove complex smoke from endoscopic images. By introducing a local-global multi-head self-attention mechanism and multi-scale depthwise convolution, the proposed model enhances the inference capability. An enhanced feature map fusion method improves the quality of reconstructed images. The improved modules enable efficient handling of variable smoke while generating superior-quality images. Through desmoking experiments on synthetic and real smoke images, the LGUformer model demonstrated superior performance compared with seven other desmoking models in terms of accuracy, clarity, absence of distortion, and robustness. A task-based surgical instrument segmentation experiment indicated the potential of this model as a pre-processing step in visual tasks. Finally, an ablation study was conducted to verify the advantages of the proposed modules.
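The abstract names multi-scale depthwise convolution as one of the model's components but gives no implementation details. The sketch below illustrates only the general idea and is not the LGUformer code: each channel is filtered independently (depthwise), the same input is filtered at several kernel sizes, and the branch outputs are merged. The fixed box kernels, the sizes 1, 3, and 5, the zero padding, and the averaging step are all assumptions made for illustration; in a trained network the kernels would be learned and the merge rule may differ.

```python
from typing import List, Sequence

# Hypothetical layout for illustration: image[channel][row][col]
Image = List[List[List[float]]]
Kernel = List[List[float]]


def box_kernel(size: int) -> Kernel:
    """Square averaging kernel (an assumed stand-in for learned weights)."""
    v = 1.0 / (size * size)
    return [[v] * size for _ in range(size)]


def depthwise_conv(x: Image, kernels: Sequence[Kernel]) -> Image:
    """Filter each channel with its own odd-sized square kernel.

    Zero padding, stride 1, correlation-style indexing as in common
    deep learning frameworks. Channels are never mixed.
    """
    out: Image = []
    for chan, k in zip(x, kernels):
        h, w, r = len(chan), len(chan[0]), len(k) // 2
        res = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in range(-r, r + 1):
                    for dj in range(-r, r + 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += chan[ii][jj] * k[di + r][dj + r]
                res[i][j] = acc
        out.append(res)
    return out


def multi_scale_depthwise(x: Image, sizes: Sequence[int] = (1, 3, 5)) -> Image:
    """Average the depthwise outputs computed at several kernel sizes."""
    branches = [depthwise_conv(x, [box_kernel(s)] * len(x)) for s in sizes]
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [
        [
            [sum(b[ch][i][j] for b in branches) / len(branches) for j in range(w)]
            for i in range(h)
        ]
        for ch in range(c)
    ]


# Usage: a 2-channel 5x5 input with a single bright pixel in channel 0.
impulse = [[[0.0] * 5 for _ in range(5)] for _ in range(2)]
impulse[0][2][2] = 1.0
result = multi_scale_depthwise(impulse)
```

In this toy input the bright pixel spreads to its neighbours at different radii depending on the kernel size, while the second channel stays at zero because depthwise filtering never combines channels.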