Authors: Abhishek Srivastava, Nikhil Kumar Tomar, Ulas Bagci, Debesh Jha
Published in: Proceedings. IEEE International Symposium on Computer-Based Medical Systems, vol. 2022, pp. 323-328
Publication date: 2022-07-01
DOI: https://doi.org/10.1109/CBMS55023.2022.00064
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9914988/pdf/nihms-1871537.pdf
Video Capsule Endoscopy Classification using Focal Modulation Guided Convolutional Neural Network.
Video capsule endoscopy (VCE) is an active research topic in computer vision and medicine. Deep learning can improve the anomaly detection rate, reduce physicians' screening time, and aid real-world clinical analysis. Computer-aided diagnosis (CADx) classification systems for VCE have shown great promise, with room for further improvement. For example, detecting cancerous polyps and bleeding can lead to a swift medical response and improve patient survival rates. To this end, an automated CADx system must have high throughput and decent accuracy. In this study, we propose FocalConvNet, a focal modulation network integrated with lightweight convolutional layers for the classification of small bowel anatomical landmarks and luminal findings. FocalConvNet leverages focal modulation to attain global context and allows global-local spatial interactions throughout the forward pass. Moreover, the convolutional block, with its intrinsic inductive/learning bias and capacity to extract hierarchical features, allows FocalConvNet to achieve favourable results with high throughput. We compare FocalConvNet with other state-of-the-art (SOTA) methods on Kvasir-Capsule, a large-scale VCE dataset with 44,228 frames across 13 classes of different anomalies. We achieved a weighted F1-score, recall, and Matthews correlation coefficient (MCC) of 0.6734, 0.6373, and 0.2974, respectively, outperforming SOTA methodologies. Further, we obtained the highest throughput, 148.02 images per second, which establishes the potential of FocalConvNet in a real-time clinical environment. The code of the proposed FocalConvNet is available at https://github.com/NoviceMAn-prog/FocalConvNet.
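The three reported metrics can be illustrated with a minimal, dependency-free sketch. It is not taken from the FocalConvNet repository: the function name `weighted_metrics` and the toy labels are hypothetical, and it assumes that both F1 and recall are averaged with per-class support weights (the abstract says "weighted" without further detail) and that MCC is the standard multiclass generalisation computed from the confusion counts.

```python
from collections import Counter
from math import sqrt


def weighted_metrics(y_true, y_pred):
    """Return (weighted F1, weighted recall, multiclass MCC).

    Hypothetical helper for illustration; assumes support-weighted averaging.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    s = len(y_true)                    # total number of samples
    t = Counter(y_true)                # true count per class (support)
    p = Counter(y_pred)                # predicted count per class
    tp = Counter(a for a, b in zip(y_true, y_pred) if a == b)
    c = sum(tp.values())               # correctly classified samples

    f1 = 0.0
    recall = 0.0
    for k, support in t.items():
        weight = support / s
        recall += weight * (tp[k] / support)
        # Per-class F1 = 2*TP / (predicted + actual); support > 0 here.
        f1 += weight * (2 * tp[k] / (p[k] + support))

    # Multiclass MCC from the aggregate confusion counts.
    num = c * s - sum(p[k] * t[k] for k in t)
    den = sqrt((s * s - sum(v * v for v in p.values()))
               * (s * s - sum(v * v for v in t.values())))
    mcc = num / den if den else 0.0
    return f1, recall, mcc


# Toy example with three classes (labels are made up).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
f1, recall, mcc = weighted_metrics(y_true, y_pred)
print(f"weighted F1={f1:.4f}, weighted recall={recall:.4f}, MCC={mcc:.4f}")
```

Under support weighting, recall reduces to plain accuracy, whereas MCC uses the whole confusion matrix and penalises predictions that merely follow the class distribution. On a heavily imbalanced dataset this can leave MCC well below the weighted F1-score.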