Marcos Fernández-Rodríguez, Bruno Silva, Sandro Queirós, Helena R. Torres, Bruno Oliveira, P. Morais, L. R. Buschle, Jorge Correia-Pinto, Estevão Lima, João L. Vilaça
Published in: Medical Imaging 2024: Image-Guided Procedures, Robotic Interventions, and Modeling, 2024-03-15. DOI: 10.1117/12.3006855 (https://doi.org/10.1117/12.3006855)
Exploring optical flow inclusion into nnU-Net framework for surgical instrument segmentation
Surgical instrument segmentation in laparoscopy is essential for computer-assisted surgical systems. Despite recent progress in deep learning, the dynamic setting of laparoscopic surgery still makes precise segmentation challenging. The nnU-Net framework has excelled in semantic segmentation, but it analyzes single frames and uses no temporal information. Its ease of use, including automatic self-configuration and low expertise requirements, has made it a popular baseline for comparisons. Optical flow (OF) is commonly used in video tasks to estimate motion between frames and represent it as a single map that carries temporal information. This work feeds OF maps to the nnU-Net architecture as an additional input to improve its performance in surgical instrument segmentation, taking advantage of the fact that instruments are the main moving objects in the surgical field. The new input adds the temporal component indirectly, without modifying the architecture. Using the CholecSeg8k dataset, three different representations of movement were estimated and used as new inputs, and the resulting models were compared with a baseline model. Results showed that OF maps improve the detection of classes with high movement, even when these classes are scarce in the dataset. To further improve performance, future work may focus on implementing other OF-preserving augmentations.
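The idea of supplying OF maps as extra input channels, and of keeping augmentations consistent with the flow, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the dense flow field is assumed to be precomputed with shape (H, W, 2), and the three representations shown (raw x/y components, magnitude, magnitude plus angle) are illustrative choices that the abstract does not specify.

```python
import numpy as np


def flow_channels(flow, representation="xy"):
    """Turn a dense flow field of shape (H, W, 2) into channel-first maps.

    The options below are illustrative assumptions, not necessarily the
    representations evaluated in the paper.
    """
    dx, dy = flow[..., 0], flow[..., 1]
    if representation == "xy":
        return np.stack([dx, dy])                  # 2 channels
    magnitude = np.hypot(dx, dy)
    if representation == "magnitude":
        return magnitude[None]                     # 1 channel
    if representation == "polar":
        return np.stack([magnitude, np.arctan2(dy, dx)])  # 2 channels
    raise ValueError(f"unknown representation: {representation}")


def add_flow_input(frame_rgb, flow, representation="xy"):
    """Concatenate an RGB frame (H, W, 3) with flow maps into (C, H, W).

    The network architecture is untouched; only the number of input
    channels grows.
    """
    rgb = np.moveaxis(frame_rgb.astype(np.float32), -1, 0)
    extra = flow_channels(flow, representation).astype(np.float32)
    return np.concatenate([rgb, extra], axis=0)


def hflip_with_flow(frame_rgb, flow):
    """Horizontal flip that keeps the flow consistent with the image.

    Mirroring the image reverses the direction of horizontal motion, so
    the x-component is negated after the array is mirrored.
    """
    flipped_frame = frame_rgb[:, ::-1]
    flipped_flow = flow[:, ::-1].copy()
    flipped_flow[..., 0] *= -1
    return flipped_frame, flipped_flow


# Usage with synthetic data: one pixel moving 3 px right and 4 px down.
frame = np.zeros((4, 6, 3), dtype=np.uint8)
flow = np.zeros((4, 6, 2), dtype=np.float32)
flow[1, 2] = (3.0, 4.0)
network_input = add_flow_input(frame, flow, "xy")  # shape (5, 4, 6)
```

The flip helper shows why a plain image augmentation is not automatically OF-preserving: a spatial transform has to be applied both to the positions of the flow vectors and to the vectors themselves.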