Application of deep learning for semantic segmentation in robotic prostatectomy: Comparison of convolutional neural networks and visual transformers

Sahyun Pak, Sung Gon Park, Jeonghyun Park, Hong Rock Choi, Jun Ho Lee, Wonchul Lee, Sung Tae Cho, Young Goo Lee, Hanjong Ahn

Investigative and Clinical Urology 2024;65(6):551-558. DOI: 10.4111/icu.20240159
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11543645/pdf/
Abstract
Purpose: Semantic segmentation is a fundamental part of the surgical application of deep learning. Traditionally, segmentation in vision tasks has been performed with convolutional neural networks (CNNs), but the transformer architecture has recently been introduced and widely investigated. We aimed to evaluate the segmentation performance of deep learning models in robot-assisted radical prostatectomy (RARP) and to identify which architecture is superior for segmentation in robotic surgery.
Materials and methods: Intraoperative images were obtained during RARP, and the dataset was randomly split into training and validation sets. Segmentation of the surgical instruments, bladder, prostate, vas, and seminal vesicle was performed using three CNN models (DeepLabv3, MANet, and U-Net++) and three transformer models (SegFormer, BEiT, and DPT), and their performances were analyzed.
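For readers who want to set up a comparable experiment, the sketch below shows one plausible way to instantiate the six architectures in PyTorch. The library choices (segmentation_models_pytorch for the CNNs, Hugging Face transformers for SegFormer), the ResNet-50 encoder, the checkpoint name, and the six-class labeling scheme are all assumptions; the paper does not report its implementation stack.

```python
# Minimal sketch, assuming segmentation_models_pytorch and Hugging Face
# transformers; the paper does not state which libraries were used.
import segmentation_models_pytorch as smp
from transformers import SegformerForSemanticSegmentation

NUM_CLASSES = 6  # 5 target structures + background (assumed label scheme)

# The three CNN architectures evaluated in the study
cnn_models = {
    "DeepLabv3": smp.DeepLabV3(encoder_name="resnet50", classes=NUM_CLASSES),
    "MANet": smp.MAnet(encoder_name="resnet50", classes=NUM_CLASSES),
    "U-Net++": smp.UnetPlusPlus(encoder_name="resnet50", classes=NUM_CLASSES),
}

# One of the three transformer architectures (SegFormer); BEiT and DPT have
# analogous classes (BeitForSemanticSegmentation, DPTForSemanticSegmentation).
segformer = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/segformer-b0-finetuned-ade-512-512",  # assumed checkpoint
    num_labels=NUM_CLASSES,
    ignore_mismatched_sizes=True,  # replace the 150-class ADE20K head
)
```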
Results: The overall segmentation performance during RARP varied across model architectures. Among the CNN models, DeepLabv3 achieved a mean Dice score of 0.938, MANet 0.944, and U-Net++ 0.930. Among the transformer models, SegFormer attained a mean Dice score of 0.919, BEiT 0.916, and DPT 0.940. The CNN models outperformed the transformer models in segmenting the prostate, vas, and seminal vesicle.
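The Dice score reported above is the standard overlap metric, Dice = 2|A∩B| / (|A| + |B|), where A and B are the predicted and ground-truth masks. The helper below is an illustrative sketch of per-class and mean Dice computation, not the authors' evaluation code; the class indexing (structures 1 to 5, background 0) is an assumption.

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for a pair of binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def mean_dice(pred_labels: np.ndarray, target_labels: np.ndarray, classes) -> float:
    """Mean Dice over the given class indices for integer label maps."""
    scores = [dice_score(pred_labels == c, target_labels == c) for c in classes]
    return float(np.mean(scores))

# Usage example with random label maps (classes 1..5 are structures, 0 is background)
pred = np.random.randint(0, 6, size=(512, 512))
gt = np.random.randint(0, 6, size=(512, 512))
print(mean_dice(pred, gt, classes=range(1, 6)))
```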
Conclusions: Deep learning models provided accurate segmentation of the surgical instruments and anatomical structures observed during RARP. Both CNN and transformer models showed reliable predictions in the segmentation task; however, CNN models may be more suitable than transformer models for organ segmentation and may be more applicable in unusual cases. Further research with large datasets is needed.
Journal Introduction:
Investigative and Clinical Urology (Investig Clin Urol, ICUrology) is an international, peer-reviewed, platinum open access journal published bimonthly. ICUrology aims to provide outstanding scientific and clinical research articles that will advance knowledge and understanding of urological diseases and current treatments. ICUrology publishes Original Articles, Rapid Communications, Review Articles, Special Articles, Innovations in Urology, Editorials, and Letters to the Editor, with a focus on the following areas of expertise:
• Precision Medicine in Urology
• Urological Oncology
• Robotics/Laparoscopy
• Endourology/Urolithiasis
• Lower Urinary Tract Dysfunction
• Female Urology
• Sexual Dysfunction/Infertility
• Infection/Inflammation
• Reconstruction/Transplantation
• Geriatric Urology
• Pediatric Urology
• Basic/Translational Research
One of the notable features of ICUrology is its use of multimedia platforms, which provide easy access to online video clips of newly developed surgical techniques via the journal's website, a QR (quick response) code located in the article, or YouTube. ICUrology provides current and highly relevant knowledge to a broad audience at the cutting edge of urological research and clinical practice.