{"title":"SwinD-Net: a lightweight segmentation network for laparoscopic liver segmentation.","authors":"Shuiming Ouyang, Baochun He, Huoling Luo, Fucang Jia","doi":"10.1080/24699322.2024.2329675","DOIUrl":null,"url":null,"abstract":"<p><p>The real-time requirement for image segmentation in laparoscopic surgical assistance systems is extremely high. Although traditional deep learning models can ensure high segmentation accuracy, they suffer from a large computational burden. In the practical setting of most hospitals, where powerful computing resources are lacking, these models cannot meet the real-time computational demands. We propose a novel network SwinD-Net based on Skip connections, incorporating Depthwise separable convolutions and Swin Transformer Blocks. To reduce computational overhead, we eliminate the skip connection in the first layer and reduce the number of channels in shallow feature maps. Additionally, we introduce Swin Transformer Blocks, which have a larger computational and parameter footprint, to extract global information and capture high-level semantic features. Through these modifications, our network achieves desirable performance while maintaining a lightweight design. We conduct experiments on the CholecSeg8k dataset to validate the effectiveness of our approach. Compared to other models, our approach achieves high accuracy while significantly reducing computational and parameter overhead. Specifically, our model requires only 98.82 M floating-point operations (FLOPs) and 0.52 M parameters, with an inference time of 47.49 ms per image on a CPU. Compared to the recently proposed lightweight segmentation network UNeXt, our model not only outperforms it in terms of the Dice metric but also has only 1/3 of the parameters and 1/22 of the FLOPs. In addition, our model achieves a 2.4 times faster inference speed than UNeXt, demonstrating comprehensive improvements in both accuracy and speed. Our model effectively reduces parameter count and computational complexity, improving the inference speed while maintaining comparable accuracy. The source code will be available at https://github.com/ouyangshuiming/SwinDNet.</p>","PeriodicalId":56051,"journal":{"name":"Computer Assisted Surgery","volume":null,"pages":null},"PeriodicalIF":1.5000,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Assisted Surgery","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1080/24699322.2024.2329675","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/3/20 0:00:00","PubModel":"Epub","JCR":"Q3","JCRName":"SURGERY","Score":null,"Total":0}
Citations: 0
Abstract
The real-time requirement for image segmentation in laparoscopic surgical assistance systems is extremely high. Although traditional deep learning models can achieve high segmentation accuracy, they carry a large computational burden. In the practical setting of most hospitals, where powerful computing resources are lacking, these models cannot meet real-time computational demands. We propose SwinD-Net, a novel network based on skip connections that incorporates depthwise separable convolutions and Swin Transformer blocks. To reduce computational overhead, we eliminate the skip connection in the first layer and reduce the number of channels in the shallow feature maps. Additionally, we introduce Swin Transformer blocks, despite their larger computational and parameter footprint, to extract global information and capture high-level semantic features. Through these modifications, our network achieves desirable performance while maintaining a lightweight design. We validate the effectiveness of our approach with experiments on the CholecSeg8k dataset. Compared to other models, our approach achieves high accuracy while significantly reducing computational and parameter overhead. Specifically, our model requires only 98.82 M floating-point operations (FLOPs) and 0.52 M parameters, with an inference time of 47.49 ms per image on a CPU. Compared to the recently proposed lightweight segmentation network UNeXt, our model not only outperforms it on the Dice metric but also uses only 1/3 of the parameters and 1/22 of the FLOPs, and it achieves a 2.4 times faster inference speed, demonstrating comprehensive improvements in both accuracy and speed. Our model effectively reduces parameter count and computational complexity, improving inference speed while maintaining comparable accuracy. The source code will be available at https://github.com/ouyangshuiming/SwinDNet.
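The lightweight design described above hinges on depthwise separable convolutions, which factor a standard K×K convolution into a per-channel spatial convolution followed by a 1×1 pointwise convolution. Below is a minimal PyTorch sketch of such a block; the class name, channel sizes, and usage are illustrative assumptions for exposition, not the authors' implementation (which is to be released at the GitHub link above).

```python
# A minimal sketch of a depthwise separable convolution block in PyTorch.
# Names and sizes are illustrative; see the authors' repository for the
# actual SwinD-Net implementation.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Factorizes a KxK convolution into a depthwise (per-channel) KxK
    convolution plus a 1x1 pointwise convolution, cutting parameters and
    FLOPs roughly by a factor of K*K when channel counts are large."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # groups=in_ch makes the convolution act independently per channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch, bias=False)
        # The 1x1 pointwise convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

if __name__ == "__main__":
    # Parameter comparison: a standard 3x3 conv vs. the separable version.
    std = nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False)
    sep = DepthwiseSeparableConv(64, 128)
    count = lambda m: sum(p.numel() for p in m.parameters())
    print(f"standard 3x3 conv:   {count(std):,} params")  # 73,728
    print(f"depthwise separable: {count(sep):,} params")  # 9,024
    y = sep(torch.randn(1, 64, 128, 128))
    print(y.shape)  # torch.Size([1, 128, 128, 128])
```

The roughly eightfold parameter reduction in this single layer illustrates how a 0.52 M-parameter budget becomes feasible when such blocks replace standard convolutions; how SwinD-Net combines them with its Swin Transformer bottleneck is detailed in the paper itself.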
About the journal
Computer Assisted Surgery aims to improve patient care by advancing the utilization of computers during treatment; to evaluate the benefits and risks associated with the integration of advanced digital technologies into surgical practice; to disseminate clinical and basic research relevant to stereotactic surgery, minimal access surgery, endoscopy, and surgical robotics; to encourage interdisciplinary collaboration between engineers and physicians in developing new concepts and applications; to educate clinicians about the principles and techniques of computer assisted surgery and therapeutics; and to serve the international scientific community as a medium for the transfer of new information relating to theory, research, and practice in biomedical imaging and the surgical specialties.
The scope of Computer Assisted Surgery encompasses all fields within surgery, as well as biomedical imaging and instrumentation, and digital technology employed as an adjunct to imaging in diagnosis, therapeutics, and surgery. Topics featured include frameless as well as conventional stereotactic procedures, surgery guided by intraoperative ultrasound or magnetic resonance imaging, image guided focused irradiation, robotic surgery, and any therapeutic interventions performed with the use of digital imaging technology.