A real-time approach for surgical activity recognition and prediction based on transformer models in robot-assisted surgery.

IF 2.3 | CAS Tier 3 (Medicine) | JCR Q3 (Engineering, Biomedical)
Ketai Chen, D S V Bandara, Jumpei Arata
{"title":"A real-time approach for surgical activity recognition and prediction based on transformer models in robot-assisted surgery.","authors":"Ketai Chen, D S V Bandara, Jumpei Arata","doi":"10.1007/s11548-024-03306-9","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>This paper presents a deep learning approach to recognize and predict surgical activity in robot-assisted minimally invasive surgery (RAMIS). Our primary objective is to deploy the developed model for implementing a real-time surgical risk monitoring system within the realm of RAMIS.</p><p><strong>Methods: </strong>We propose a modified Transformer model with the architecture comprising no positional encoding, 5 fully connected layers, 1 encoder, and 3 decoders. This model is specifically designed to address 3 primary tasks in surgical robotics: gesture recognition, prediction, and end-effector trajectory prediction. Notably, it operates solely on kinematic data obtained from the joints of robotic arm.</p><p><strong>Results: </strong>The model's performance was evaluated on JHU-ISI Gesture and Skill Assessment Working Set dataset, achieving highest accuracy of 94.4% for gesture recognition, 84.82% for gesture prediction, and significantly low distance error of 1.34 mm with a prediction of 1 s in advance. Notably, the computational time per iteration was minimal recorded at only 4.2 ms.</p><p><strong>Conclusion: </strong>The results demonstrated the excellence of our proposed model compared to previous studies highlighting its potential for integration in real-time systems. We firmly believe that our model could significantly elevate realms of surgical activity recognition and prediction within RAS and make a substantial and meaningful contribution to the healthcare sector.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2025-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Assisted Radiology and Surgery","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1007/s11548-024-03306-9","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Citations: 0

Abstract

Purpose: This paper presents a deep learning approach to recognize and predict surgical activity in robot-assisted minimally invasive surgery (RAMIS). Our primary objective is to deploy the developed model as part of a real-time surgical risk monitoring system for RAMIS.

Methods: We propose a modified Transformer model whose architecture comprises no positional encoding, five fully connected layers, one encoder, and three decoders. The model is designed to address three primary tasks in surgical robotics: gesture recognition, gesture prediction, and end-effector trajectory prediction. Notably, it operates solely on kinematic data obtained from the joints of the robotic arm.
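The abstract gives only a high-level description of this architecture. As a rough illustration (not the authors' code), the following PyTorch sketch wires together one encoder layer, three decoder layers, no positional encoding, and five fully connected layers (two embedding layers plus three task heads). All dimensions, the 76-channel kinematic input (the JIGSAWS kinematic feature count), and the decoder query construction are assumptions.

```python
# A minimal sketch, assuming standard PyTorch Transformer blocks; hypothetical
# dimensions and query construction, not the authors' implementation.
import torch
import torch.nn as nn

class SurgicalActivityTransformer(nn.Module):
    def __init__(self, kin_dim=76, d_model=128, n_heads=8,
                 n_gestures=15):
        super().__init__()
        # Two FC layers embed raw joint kinematics; no positional encoding is added.
        self.embed = nn.Sequential(
            nn.Linear(kin_dim, d_model), nn.ReLU(),
            nn.Linear(d_model, d_model), nn.ReLU(),
        )
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=1)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=3)
        # Three FC task heads: current gesture, future gesture, future 3-D tool position.
        self.recognize = nn.Linear(d_model, n_gestures)
        self.predict = nn.Linear(d_model, n_gestures)
        self.trajectory = nn.Linear(d_model, 3)

    def forward(self, kin, query):
        # kin: (batch, time, kin_dim) joint kinematics.
        # query: (batch, horizon, d_model) decoder queries for future frames.
        memory = self.encoder(self.embed(kin))
        out = self.decoder(query, memory)
        last = memory[:, -1]                      # most recent encoded frame
        return (self.recognize(last),             # current gesture logits
                self.predict(out[:, -1]),         # future gesture logits
                self.trajectory(out))             # predicted end-effector path
```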

Results: The model's performance was evaluated on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) dataset, achieving a highest accuracy of 94.4% for gesture recognition, 84.82% for gesture prediction, and a low distance error of 1.34 mm when predicting 1 s in advance. Notably, the computational time per iteration was only 4.2 ms.
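For context on the real-time claim: JIGSAWS kinematics are recorded at 30 Hz, so a 4.2 ms iteration fits comfortably within the roughly 33.3 ms frame budget. A minimal timing harness for the sketch above (window length, horizon, and warm-up handling are assumptions) might look like this:

```python
# Illustrative latency check for the sketch above; output is not a
# reproduction of the paper's 4.2 ms result.
import time
import torch

model = SurgicalActivityTransformer().eval()
kin = torch.randn(1, 60, 76)      # assumed 2 s window of 30 Hz joint kinematics
query = torch.randn(1, 30, 128)   # assumed 1 s (30-frame) prediction horizon

with torch.no_grad():
    model(kin, query)             # warm-up pass
    t0 = time.perf_counter()
    model(kin, query)
    ms = (time.perf_counter() - t0) * 1e3
print(f"{ms:.1f} ms per iteration (frame budget at 30 Hz: ~33.3 ms)")
```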

Conclusion: The results demonstrate that our proposed model compares favorably with previous studies, highlighting its potential for integration into real-time systems. We believe the model could significantly advance surgical activity recognition and prediction within RAMIS and make a meaningful contribution to the healthcare sector.

Source journal
International Journal of Computer Assisted Radiology and Surgery (ENGINEERING, BIOMEDICAL; RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)
CiteScore: 5.90
Self-citation rate: 6.70%
Publication volume: 243
Review turnaround: 6-12 weeks
Journal description: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines and encourages interdisciplinary research and development activities in an international environment.