Yuan Ding, Jingfan Fan, Kun Pang, Heng Li, Tianyu Fu, Hong Song, Lingfeng Chen, Jian Yang
{"title":"Surgical Workflow Recognition Using Two-Stream Mixed Convolution Network","authors":"Yuan Ding, Jingfan Fan, Kun Pang, Heng Li, Tianyu Fu, Hong Song, Lingfeng Chen, Jian Yang","doi":"10.1109/AEMCSE50948.2020.00064","DOIUrl":null,"url":null,"abstract":"Surgical workflow recognition is the prerequisite for automatic indexing of surgical video databases and optimization of real-time operating scheduling, which is an important part of the modern operating room (OR). In this paper, we propose a surgical phase recognition method based on a two-stream mixed convolutional network (TsMCNet) to automatically recognize surgical workflow. TsMCNet optimizes the visual and temporal features learned from surgical videos by integrating 2D and 3D convolutional networks (CNNs) to form a spatio-temporal complementary architecture. Specifically, temporal branch (3D CNN) is responsible for learning the spatio-temporal features among adjacent frames, whereas the parallel visual branch (2D CNN) is focused on capturing the deep visual features of each frame. 
Extensive experiments on a public surgical video dataset (MICCAI 2016 Workflow Challenge) demonstrated outstanding performance of our proposed method, exceeding that of state-of-the-art methods (e.g., 86.2% accuracy and 83.0% F1 score).","PeriodicalId":246841,"journal":{"name":"2020 3rd International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 3rd International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AEMCSE50948.2020.00064","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
Surgical workflow recognition is a prerequisite for the automatic indexing of surgical video databases and the optimization of real-time operating scheduling, and it is an important part of the modern operating room (OR). In this paper, we propose a surgical phase recognition method based on a two-stream mixed convolutional network (TsMCNet) to automatically recognize surgical workflow. TsMCNet optimizes the visual and temporal features learned from surgical videos by integrating 2D and 3D convolutional neural networks (CNNs) to form a spatio-temporal complementary architecture. Specifically, the temporal branch (3D CNN) learns the spatio-temporal features among adjacent frames, whereas the parallel visual branch (2D CNN) captures the deep visual features of each frame. Extensive experiments on a public surgical video dataset (MICCAI 2016 Workflow Challenge) demonstrated that the proposed method outperforms state-of-the-art methods (e.g., 86.2% accuracy and 83.0% F1 score).
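The abstract reports accuracy and F1 score without stating how they are computed. As a minimal sketch, assuming frame-level phase labels and a macro-averaged F1 (both are assumptions, not details from the paper), the two metrics could be calculated as follows. The function names and the integer phase labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of frames whose predicted phase matches the annotated phase."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-phase F1 scores (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted phase p, but it was wrong
            fn[t] += 1  # true phase t was missed
    labels = sorted(set(y_true) | set(y_pred))
    # F1 = 2TP / (2TP + FP + FN); the denominator is non-zero because every
    # label in `labels` occurs at least once in y_true or y_pred.
    scores = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in labels]
    return sum(scores) / len(scores)


# Hypothetical per-frame phase labels for six frames of one video.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Macro averaging weights every phase equally, so short phases affect the score as much as long ones. A frame-weighted or per-video average would give different numbers on the same predictions.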