Dual-stage semantic segmentation of endoscopic surgical instruments

IF 3.2 · CAS Tier 2 (Medicine) · Q1 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Medical Physics · Pub Date: 2024-09-10 · DOI: 10.1002/mp.17397
Wenxin Chen, Kaifeng Wang, Xinya Song, Dongsheng Xie, Xue Li, Mobarakol Islam, Changsheng Li, Xingguang Duan
Abstract

Background: Endoscopic instrument segmentation is essential for ensuring the safety of robotic-assisted spinal endoscopic surgeries. However, due to the narrow operative region, intricate surrounding tissues, and limited visibility, achieving instrument segmentation within the endoscopic view remains challenging.

Purpose: This work aims to devise a method to segment surgical instruments in endoscopic video. By designing an endoscopic image classification model, features of preceding and succeeding video frames are extracted to achieve continuous and precise segmentation of instruments in endoscopic videos.

Methods: Deep learning techniques serve as the algorithmic core for constructing the convolutional neural network proposed in this study. The method comprises two stages: image classification and instrument segmentation. MobileViT is employed for image classification, enabling the extraction of key features of different instruments and generating classification results. DeepLabv3+ is used for instrument segmentation. By training on each instrument separately, corresponding model parameters are obtained. Lastly, a flag caching mechanism along with a blur detection module is designed to effectively exploit the image features of consecutive frames. By loading instrument-specific parameters into the segmentation model, better segmentation of surgical instruments can be achieved in endoscopic videos.

Results: The classification and segmentation models are evaluated on an endoscopic image dataset. In the dataset used for instrument segmentation, the training set contains 7456 images, the validation set 829 images, and the test set 921 images. In the dataset used for image classification, the training set contains 2400 images and the validation set 600 images. The image classification model achieves an accuracy of 70% on the validation set. For the segmentation model, experiments are conducted on two common surgical instruments, and the mean Intersection over Union (mIoU) exceeds 98%. Furthermore, the proposed video segmentation method is tested on videos collected during surgeries, validating the effectiveness of the flag caching mechanism and the blur detection module.

Conclusions: Experimental results on the dataset demonstrate that the dual-stage video processing method excels at instrument segmentation tasks under endoscopic conditions. This advancement is significant for enhancing the intelligence level of robotic-assisted spinal endoscopic surgeries.
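The Methods describe a per-frame loop in which a classifier's label selects instrument-specific segmentation parameters, a flag cache carries the last confident label across frames, and a blur detection module filters unreliable frames. The paper does not specify these implementations; the sketch below is a minimal illustration that assumes a variance-of-Laplacian blur test and a simple last-confident-label cache, with all names (`laplacian_variance`, `is_blurry`, `FlagCache`) being illustrative rather than from the paper.

```python
# Hypothetical sketch of the dual-stage video loop's supporting modules.
# The blur metric (variance of the Laplacian) and the cache policy are
# assumptions for illustration, not the paper's stated implementation.
import numpy as np

# 3x3 Laplacian kernel, a standard discrete second-derivative operator.
LAPLACIAN = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float64)

def laplacian_variance(gray: np.ndarray) -> float:
    """Sharpness score: variance of the Laplacian response over the frame.

    Sharp frames have strong edges and hence a high-variance response;
    blurred frames have a flat, low-variance response.
    """
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    # Valid-mode convolution implemented with shifted slices (the kernel is
    # symmetric, so correlation and convolution coincide).
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * gray[dy:dy + h - 2, dx:dx + w - 2]
    return float(out.var())

def is_blurry(gray: np.ndarray, threshold: float = 100.0) -> bool:
    """Flag a frame as blurry when its sharpness score falls below threshold."""
    return laplacian_variance(gray) < threshold

class FlagCache:
    """Caches the last confidently classified instrument label so that
    blurry or ambiguous frames reuse the previous frame's segmentation
    parameters instead of switching models spuriously."""

    def __init__(self) -> None:
        self.flag = None  # last confident instrument label

    def update(self, label, confident: bool):
        if confident:
            self.flag = label
        return self.flag  # label to use for selecting segmentation weights
```

In a full pipeline, each frame would first pass the blur test; sharp frames are classified (e.g., by MobileViT) and update the cache, while blurry frames fall back to `cache.flag`, keeping the DeepLabv3+ parameter selection stable across consecutive frames.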
Citations: 0

Source journal: Medical Physics (Medicine – Nuclear Medicine)
CiteScore: 6.80
Self-citation rate: 15.80%
Annual articles: 660
Review time: 1.7 months
Journal description: Medical Physics publishes original, high impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in 1) Basic science developments with high potential for clinical translation 2) Clinical applications of cutting edge engineering and physics innovations 3) Broadly applicable and innovative clinical physics developments Medical Physics is a journal of global scope and reach. By publishing in Medical Physics your research will reach an international, multidisciplinary audience including practicing medical physicists as well as physics- and engineering based translational scientists. We work closely with authors of promising articles to improve their quality.