A Reinforcement Learning Approach for Real-Time Articulated Surgical Instrument 3-D Pose Reconstruction

Impact Factor: 3.4; JCR Q2 (Engineering, Biomedical)

IEEE Transactions on Medical Robotics and Bionics; Pub Date: 2024-09-19; DOI: 10.1109/TMRB.2024.3464089

Ke Fan;Ziyang Chen;Qiaoling Liu;Giancarlo Ferrigno;Elena De Momi

Volume 6, Issue 4, pp. 1458-1467; https://ieeexplore.ieee.org/document/10684243/


Abstract

3D pose reconstruction of surgical instruments from images is a critical component of environment perception in robotic minimally invasive surgery (RMIS). Current deep learning methods rely on complex networks to improve accuracy, which makes real-time implementation difficult. Moreover, unlike a single rigid body, surgical instruments have an articulated structure, which makes annotating their 3D poses more challenging. In this paper, we present a novel approach that formulates the 3D pose reconstruction of articulated surgical instruments as a Markov Decision Process (MDP). A Reinforcement Learning (RL) agent uses 2D image labels to control a virtual articulated skeleton so that it reproduces the 3D pose of the real surgical instrument. First, a convolutional neural network estimates the 2D pixel positions of the joint nodes of the surgical instrument skeleton. The agent then controls the 3D virtual articulated skeleton to align the projections of its joint nodes on the image plane with those in the real image. We validate the proposed method on a semi-synthetic dataset with precise 3D pose labels and on two real datasets, demonstrating its accuracy and efficacy. The results indicate the potential of our method to achieve real-time 3D pose reconstruction for articulated surgical instruments in RMIS, addressing the challenges posed by low-texture surfaces and articulated structures.
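The alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the camera model or the reward, so everything here is an assumption made for illustration: a pinhole camera with a hypothetical intrinsic matrix `K`, joint nodes already expressed in the camera frame, and a reward defined as the negative mean pixel distance between projected and detected joints. The function names `project_points` and `alignment_reward` are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def project_points(points_cam, K):
    """Project N x 3 camera-frame points to N x 2 pixels (pinhole model, assumed)."""
    points_cam = np.asarray(points_cam, dtype=float)
    uvw = points_cam @ K.T            # homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide by depth

def alignment_reward(joints_3d, keypoints_2d, K):
    """Negative mean pixel distance between projected and detected joint nodes.

    This reward shape is an assumption; the paper's actual reward may differ.
    """
    projected = project_points(joints_3d, K)
    errors = np.linalg.norm(projected - np.asarray(keypoints_2d, dtype=float), axis=1)
    return -float(errors.mean())

# Hypothetical intrinsics and a three-node skeleton, for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
true_joints = np.array([[0.00, 0.00, 0.10],
                        [0.01, 0.00, 0.10],
                        [0.02, 0.01, 0.11]])

# Stand-in for the CNN output: the 2D positions of the true joints.
detected_2d = project_points(true_joints, K)

# A virtual skeleton displaced by 5 mm along x scores worse than a perfect match.
shifted = true_joints + np.array([0.005, 0.0, 0.0])
print(alignment_reward(true_joints, detected_2d, K))
print(alignment_reward(shifted, detected_2d, K))
```

In this sketch an agent would adjust the virtual skeleton's pose and joint angles to drive the reward toward zero. Note that a 2D reprojection error alone does not pin down depth for an arbitrary point set; the known geometry of the instrument skeleton is what constrains the 3D pose.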

