Jixiu Li, Yisen Huang, W. Ng, T. Cheng, Xixin Wu, Q. Dou, Helen M. Meng, P. Heng, Yunhui Liu, S. Chan, D. Navarro-Alarcon, Calvin Sze Hang Ng, Philip Wai Yan Chiu, Zheng Li
{"title":"Speech-Vision Based Multi-Modal AI Control of a Magnetic Anchored and Actuated Endoscope","authors":"Jixiu Li, Yisen Huang, W. Ng, T. Cheng, Xixin Wu, Q. Dou, Helen M. Meng, P. Heng, Yunhui Liu, S. Chan, D. Navarro-Alarcon, Calvin Sze Hang Ng, Philip Wai Yan Chiu, Zheng Li","doi":"10.1109/ROBIO55434.2022.10011904","DOIUrl":null,"url":null,"abstract":"In minimally invasive surgery (MIS), controlling the endoscope view is crucial for the operation. Many robotic endoscope holders were developed aiming to address this prob-lem,. These systems rely on joystick, foot pedal, simple voice command, etc. to control the robot. These methods requires surgeons extra effort and are not intuitive enough. In this paper, we propose a speech-vision based multi-modal AI approach, which integrates deep learning based instrument detection, automatic speech recognition and robot visual servo control. Surgeons could communicate with the endoscope by speech to indicate their view preference, such as the instrument to be tracked. The instrument is detected by the deep learning neural network. Then the endoscope takes the detected instrument as the target and follows it with the visual servo controller. This method is applied to a magnetic anchored and guided endoscope and evaluated experimentally. 
Preliminary results demonstrated this approach is effective and requires little efforts for the surgeon to control the endoscope view intuitively.","PeriodicalId":151112,"journal":{"name":"2022 IEEE International Conference on Robotics and Biomimetics (ROBIO)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Robotics and Biomimetics (ROBIO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROBIO55434.2022.10011904","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
In minimally invasive surgery (MIS), controlling the endoscope view is crucial for the operation. Many robotic endoscope holders have been developed to address this problem. These systems rely on a joystick, foot pedal, simple voice commands, etc. to control the robot; such methods require extra effort from surgeons and are not intuitive enough. In this paper, we propose a speech-vision based multi-modal AI approach that integrates deep-learning-based instrument detection, automatic speech recognition, and robot visual servo control. Surgeons communicate with the endoscope by speech to indicate their view preference, such as which instrument to track. The instrument is detected by a deep neural network; the endoscope then takes the detected instrument as its target and follows it with the visual servo controller. This method is applied to a magnetic anchored and guided endoscope and evaluated experimentally. Preliminary results demonstrate that the approach is effective and lets the surgeon control the endoscope view intuitively with little effort.
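The abstract does not give implementation details of the controller, so the following is only a minimal, hypothetical sketch of the image-based visual-servo step it describes: the instrument detector yields a bounding box, and a proportional law drives that box toward the image center. All names (`bbox_center`, `servo_step`) and the gain value are illustrative assumptions, not taken from the paper; the mapping from the pixel-space correction to actual endoscope motion (the image Jacobian of the magnetic actuation) is omitted.

```python
def bbox_center(bbox):
    """Center (u, v) of a detection box given as (x, y, w, h) in pixels."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h / 2.0)

def servo_step(bbox, image_size, gain=0.5):
    """One proportional visual-servo update.

    Returns a (du, dv) pixel-space correction that, mapped through the
    robot's image Jacobian (not modeled here), would move the endoscope
    so the tracked instrument drifts toward the image center.
    """
    width, height = image_size
    cu, cv = bbox_center(bbox)
    error_u = width / 2.0 - cu    # horizontal offset from image center
    error_v = height / 2.0 - cv   # vertical offset from image center
    return (gain * error_u, gain * error_v)

# Example: instrument detected right of center in a 640x480 frame,
# so the correction pulls the view back toward it.
correction = servo_step((400, 220, 80, 40), (640, 480))
print(correction)  # (-60.0, 0.0)
```

In a running system this step would be called once per frame with the latest detection, with the speech-recognition module selecting which detected instrument's box is passed in.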