Speech-Vision Based Multi-Modal AI Control of a Magnetic Anchored and Actuated Endoscope

Jixiu Li, Yisen Huang, W. Ng, T. Cheng, Xixin Wu, Q. Dou, Helen M. Meng, P. Heng, Yunhui Liu, S. Chan, D. Navarro-Alarcon, Calvin Sze Hang Ng, Philip Wai Yan Chiu, Zheng Li
Published in: 2022 IEEE International Conference on Robotics and Biomimetics (ROBIO), 2022-12-05
DOI: 10.1109/ROBIO55434.2022.10011904 (https://doi.org/10.1109/ROBIO55434.2022.10011904)

Abstract

In minimally invasive surgery (MIS), controlling the endoscope view is crucial for the operation. Many robotic endoscope holders have been developed to address this problem. These systems rely on a joystick, foot pedal, simple voice commands, and similar inputs to control the robot. Such methods demand extra effort from the surgeon and are not intuitive enough. In this paper, we propose a speech-vision based multi-modal AI approach that integrates deep learning based instrument detection, automatic speech recognition, and robot visual servo control. Surgeons can communicate with the endoscope by speech to indicate their view preference, such as the instrument to be tracked. The instrument is detected by a deep neural network. The endoscope then takes the detected instrument as the target and follows it using the visual servo controller. This method is applied to a magnetic anchored and guided endoscope and evaluated experimentally. Preliminary results demonstrate that the approach is effective and lets the surgeon control the endoscope view intuitively with little effort.
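The abstract does not give implementation details, so the following is only a minimal sketch of how the three stages (speech command, instrument detection, visual servoing) might be chained. Every name here is hypothetical: the instrument vocabulary, the `Detection` record, the confidence threshold, and the proportional gain are illustrative assumptions, and the simple proportional centering law stands in for whatever visual servo controller the authors actually use.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One detector output (hypothetical format, not the authors' own)."""
    label: str
    score: float
    box: Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max in pixels


# Assumed vocabulary of trackable instruments; the paper does not list one.
KNOWN_INSTRUMENTS = ("grasper", "scissors", "hook")


def parse_command(transcript: str) -> Optional[str]:
    """Return the first known instrument name found in a recognized utterance."""
    words = re.findall(r"[a-z]+", transcript.lower())
    for word in words:
        if word in KNOWN_INSTRUMENTS:
            return word
    return None


def select_target(detections: Sequence[Detection], label: str,
                  min_score: float = 0.5) -> Optional[Detection]:
    """Pick the most confident detection carrying the requested label."""
    candidates = [d for d in detections
                  if d.label == label and d.score >= min_score]
    return max(candidates, key=lambda d: d.score, default=None)


def servo_command(target: Detection, image_size: Tuple[int, int],
                  gain: float = 0.5) -> Tuple[float, float]:
    """Proportional image-based law: drive the box centre to the image centre.

    Returns (pan, tilt) rates in normalized units. The sign convention and
    the mapping to actuator commands are assumptions of this sketch.
    """
    width, height = image_size
    cx = (target.box[0] + target.box[2]) / 2.0
    cy = (target.box[1] + target.box[3]) / 2.0
    # Normalized pixel error in [-1, 1] along each image axis.
    ex = (cx - width / 2.0) / (width / 2.0)
    ey = (cy - height / 2.0) / (height / 2.0)
    return gain * ex, gain * ey


def control_step(transcript: str, detections: Sequence[Detection],
                 image_size: Tuple[int, int]) -> Optional[Tuple[float, float]]:
    """One cycle: spoken request -> target selection -> servo command."""
    label = parse_command(transcript)
    if label is None:
        return None  # no recognizable instrument in the utterance
    target = select_target(detections, label)
    if target is None:
        return None  # requested instrument is not visible in this frame
    return servo_command(target, image_size)
```

In this sketch the speech channel only chooses which detection becomes the servo target, while the vision channel supplies the error signal. That mirrors the division of labor described in the abstract, without claiming anything about the actual system's interfaces.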