Automatic recognition of surgical phase of robot-assisted radical prostatectomy based on artificial intelligence deep-learning model and its application in surgical skill evaluation: a joint study of 18 medical education centers.

Impact factor 2.4; Zone 2 (Medicine); JCR Q2 (Surgery)

Surgical Endoscopy And Other Interventional Techniques Pub Date : 2025-07-10 DOI:10.1007/s00464-025-11967-z

Xue Zhao, Shin Takenaka, Shuntaro Iuchi, Daichi Kitaguchi, Masashi Wakabayashi, Kodai Sato, Shintaro Arakaki, Kimimasa Sasaki, Norihito Kosugi, Nobushige Takeshita, Nobuyoshi Takeshita, Shinichi Sakamoto, Tomohiko Ichikawa, Masaaki Ito



Abstract

Background: Surgical proficiency influences surgical quality and patient outcomes in robot-assisted radical prostatectomy (RARP). Manual video evaluations are labor-intensive and lack standardized objective metrics. Herein, we aimed to develop an artificial intelligence (AI) deep-learning model that can identify the surgical phases in RARP videos and create a parameter-based scoring system to distinguish experts from novice surgeons based on the results of the AI model.

Methods: A dataset of 410 RARP videos from 18 Japanese medical institutions was analyzed. The videos were annotated into 11 phases and divided into training and testing sets. Surgeons were categorized as experts or novices based on their RARP experience. We developed a deep-learning-based surgical phase classification model and compared the phase duration, number of transitions between phases, and AI confidence scores (AICS) between the groups based on the model's output. Key parameters were standardized and identified using stepwise multivariate logistic regression. A surgical skill scoring system was constructed based on the receiver operating characteristic curve cut-off values.
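The three parameter families compared between groups (phase duration, number of phase transitions, and AICS) can all be derived from frame-level model output. The following is a minimal sketch, not the authors' implementation. It assumes one predicted phase label and one confidence value per sampled frame, a known sampling rate `fps`, and that the AICS is summarized as the mean per-frame confidence; the abstract does not define the AICS, so that choice and the function name `summarize_phase_predictions` are hypothetical.

```python
from collections import Counter


def summarize_phase_predictions(phases, confidences, fps=1.0):
    """Summarize frame-level phase predictions for one video.

    phases      -- predicted phase label per sampled frame (assumed input)
    confidences -- model confidence per sampled frame (assumed input)
    fps         -- sampled frames per second (assumed known)
    """
    if len(phases) != len(confidences):
        raise ValueError("phases and confidences must have the same length")
    if not phases:
        raise ValueError("at least one frame is required")
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Phase duration: number of frames assigned to each phase, in seconds.
    frame_counts = Counter(phases)
    durations_sec = {phase: n / fps for phase, n in frame_counts.items()}

    # Transitions: how often the predicted phase changes between frames.
    transitions = sum(1 for a, b in zip(phases, phases[1:]) if a != b)

    # Assumed AICS summary: mean confidence over all frames.
    mean_confidence = sum(confidences) / len(confidences)

    return {
        "durations_sec": durations_sec,
        "transitions": transitions,
        "mean_confidence": mean_confidence,
    }


if __name__ == "__main__":
    summary = summarize_phase_predictions(
        [1, 1, 2, 2, 2, 1, 3],
        [0.9, 0.8, 0.7, 0.9, 1.0, 0.6, 0.7],
        fps=1.0,
    )
    print(summary)
```

Summaries of this kind, computed once per video, would be the inputs to the group comparison and the regression step described above.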

Results: Of the 213 videos, 99 were used for training, 20 for validation, and 94 for testing (61 experts and 33 novices). The model achieved an accuracy of 0.89 in identifying surgical phases. The experts had significantly shorter durations in phases 2-8 and higher AICS than the novices. Stepwise analysis identified phases 2 (Retzius space expansion), 7 (dorsal venous complex incision, apex treatment, hemostasis), and 8 (urethrovesical anastomosis) and the AICS as key predictors of expertise. The scoring system developed from these variables effectively distinguished experts from novices with an accuracy of 86.2%.
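A scoring system built on ROC cut-off values can be illustrated as follows. This is a sketch under stated assumptions: the abstract does not say how the cut-offs were selected or how points were assigned, so the use of Youden's J statistic, the one-point-per-parameter rule, the parameter names, and the toy numbers are all hypothetical and are not taken from the study.

```python
def youden_cutoff(values, is_expert, expert_is_lower=True):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    Assumption: experts are the positive class. With expert_is_lower=True a
    value <= cutoff predicts "expert" (e.g. a shorter phase duration);
    otherwise a value >= cutoff predicts "expert" (e.g. a higher AICS).
    """
    n_pos = sum(1 for y in is_expert if y)
    n_neg = len(is_expert) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both experts and novices are required")

    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        if expert_is_lower:
            pred = [v <= cut for v in values]
        else:
            pred = [v >= cut for v in values]
        tp = sum(1 for p, y in zip(pred, is_expert) if p and y)
        tn = sum(1 for p, y in zip(pred, is_expert) if not p and not y)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


def skill_score(params, cutoffs):
    """Award one point per parameter on the expert side of its cutoff.

    cutoffs maps a parameter name to (cutoff, expert_is_lower).
    """
    score = 0
    for name, (cut, expert_is_lower) in cutoffs.items():
        value = params[name]
        on_expert_side = value <= cut if expert_is_lower else value >= cut
        score += int(on_expert_side)
    return score


if __name__ == "__main__":
    # Toy data only: durations in minutes for a single hypothetical phase.
    durations = [10, 12, 14, 20, 22, 25]
    labels = [True, True, True, False, False, False]
    cut, j = youden_cutoff(durations, labels, expert_is_lower=True)
    cutoffs = {"phase2_min": (cut, True), "aics": (0.85, False)}
    print(cut, j, skill_score({"phase2_min": 12, "aics": 0.9}, cutoffs))
```

In a design like this, the total score would itself be thresholded to label a surgeon as expert or novice, which is one way an overall accuracy figure could be obtained on a held-out test set.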

Conclusions: The developed AI model revealed that the duration of several surgical phases and AICS are key parameters in assessing surgical skill proficiency in RARP. The new scoring system established based on these indicators reliably differentiates expert from novice surgeons.


Source journal

Surgical Endoscopy And Other Interventional Techniques (Medicine: Surgery)

CiteScore: 6.10

Self-citation rate: 12.90%

Articles published: 890

Review time: 6 months

About the journal: Uniquely positioned at the interface between various medical and surgical disciplines, Surgical Endoscopy serves as a focal point for the international surgical community to exchange information on practice, theory, and research. Topics covered in the journal include surgical aspects of interventional endoscopy, ultrasound, and other techniques in the fields of gastroenterology, obstetrics, gynecology, and urology; gastroenterologic surgery; thoracic surgery; traumatic surgery; orthopedic surgery; and pediatric surgery.