Wenting Xu, Ruiguo Liu, Weifeng Zhang, Z. Chao, F. Jia
2021 IEEE 13th International Conference on Computer Research and Development (ICCRD), published 2021-01-05. DOI: https://doi.org/10.1109/ICCRD51685.2021.9386349
Surgical Action and Instrument Detection Based on Multiscale Information Fusion
The detection of surgical actions and instruments plays an important role in computer-assisted endoscopic surgery. However, organ deformation and a narrow surgical field make the task difficult, and the problem remains unsolved. In this paper, we propose a multiscale fusion feature pyramid network (MSF-FPN) that merges low-level and high-level semantic information. First, the initial layer of the pyramid network aggregates the feature-map information; the information then diverges after the feature information is cross-transmitted in the middle layer; finally, a feature map with strong semantics is obtained in the output layer. Experiments show that on the public endoscopic surgeon action detection (ESAD) dataset, the average precision of MSF-FPN is 2.9% and 1.5% higher than that of the general FPN and the path aggregation network (PANet), respectively, and on the proposed cataract-based object detection (COD) dataset it is 4.3% and 2.6% higher, respectively.
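The abstract does not specify the MSF-FPN layers, so the following is only a rough sketch of the general idea behind the two baselines it names: an FPN-style top-down pass followed by a PANet-style bottom-up pass. It is not the authors' network. Single-channel maps stored as nested lists, nearest-neighbour upsampling, strided downsampling, and the function names (`upsample2x`, `downsample2x`, `fuse_pyramid`) are all assumptions made for illustration.

```python
# Illustrative only: generic top-down + bottom-up pyramid fusion on
# single-channel feature maps. This is NOT the MSF-FPN from the paper;
# every function name and design choice here is an assumption.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out

def downsample2x(fmap):
    """2x downsampling by keeping every second row and column."""
    return [row[::2] for row in fmap[::2]]

def add(a, b):
    """Element-wise sum of two maps with identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def fuse_pyramid(levels):
    """Fuse maps ordered fine -> coarse, each half the size of the previous."""
    # Top-down pass (FPN-style): push coarse, high-level features
    # into the finer levels.
    top_down = [None] * len(levels)
    top_down[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        top_down[i] = add(levels[i], upsample2x(top_down[i + 1]))
    # Bottom-up pass (PANet-style): push fine, low-level features
    # back towards the coarser levels.
    outputs = [top_down[0]]
    for i in range(1, len(levels)):
        outputs.append(add(top_down[i], downsample2x(outputs[-1])))
    return outputs

def constant_map(size, value):
    """Square map filled with one value, used as toy input."""
    return [[value] * size for _ in range(size)]

# Toy pyramid: 4x4, 2x2 and 1x1 maps with distinct values per level.
levels = [constant_map(4, 1.0), constant_map(2, 10.0), constant_map(1, 100.0)]
fused = fuse_pyramid(levels)
```

After both passes every output level contains contributions from all three input levels, which is the sense in which such pyramids merge low-level and high-level information. A real detector would use multi-channel tensors with learned convolutions at each merge step rather than plain sums.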