Tingqiang Deng, Rui Li, Chunguo Li, Rutian Liao, Yang Liu, Zhen Yang, Luxi Yang
DOI: 10.1117/12.2540365 (https://doi.org/10.1117/12.2540365)
Published: 2019-07-31. Venue as listed: ... International Workshop on Pattern Recognition in NeuroImaging, vol. 2 1, pp. 1119806 - 1119806-6. Publication type: Journal Article. Citation count: 1. Indexed via: Semanticscholar.
MBNet: multi-scale bilinear convolutional neural networks for fine-grained visual classification towards real-time tasks
Fine-grained visual classification (FGVC) is difficult because low-level features are under-utilized. This paper proposes MBNet, a real-time method based on a multi-stream, multi-scale cross bilinear CNN, to address this problem. First, features from each layer of the multi-stream CNN are extracted by a base network such as VGGNet; a multi-stream cross bilinear vector and a bottom bilinear vector are then computed from the low- and high-level features, respectively. The FGVC prediction is made after feature fusion, so that small, low-level details in the original image are no longer easily overlooked. On the widely used Caltech-UCSD Birds, Stanford Cars and Aircraft datasets, the proposed method reaches accuracies of 88.51%, 94.73% and 92.41%, respectively, a significant improvement over existing methods and at the state-of-the-art level. It also meets the requirements of real-time tasks.
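The abstract does not give MBNet's equations, so the following is only a minimal NumPy sketch of the general idea of bilinear pooling across feature levels, not the authors' implementation. It assumes the standard bilinear pooling recipe (outer product averaged over spatial positions, signed square root, L2 normalization), assumes the low-level map is average-pooled to the high-level map's resolution, and uses a self-bilinear term on the low-level map as a stand-in for the "bottom bilinear vector", whose exact definition is not stated. The names `bilinear_vector`, `avg_pool_to` and `fused_descriptor` are hypothetical.

```python
import numpy as np


def bilinear_vector(x, y, eps=1e-12):
    """Bilinear pooling of two feature maps that share a spatial size.

    x: (C1, H, W), y: (C2, H, W) -> normalized vector of length C1 * C2.
    """
    c1, h, w = x.shape
    c2 = y.shape[0]
    assert y.shape[1:] == (h, w), "feature maps must share H and W"
    xf = x.reshape(c1, h * w)
    yf = y.reshape(c2, h * w)
    # Outer product of channel responses, averaged over spatial positions.
    b = (xf @ yf.T) / (h * w)
    v = b.reshape(-1)
    # Signed square root followed by L2 normalization (common B-CNN practice).
    v = np.sign(v) * np.sqrt(np.abs(v))
    return v / (np.linalg.norm(v) + eps)


def avg_pool_to(x, out_hw):
    """Average-pool (C, H, W) to (C, out_h, out_w); H, W must be multiples."""
    c, h, w = x.shape
    oh, ow = out_hw
    assert h % oh == 0 and w % ow == 0, "sizes must divide evenly"
    return x.reshape(c, oh, h // oh, ow, w // ow).mean(axis=(2, 4))


def fused_descriptor(low, high):
    """Concatenate a cross-level term and a low-level self term (assumed fusion)."""
    low_small = avg_pool_to(low, high.shape[1:])
    cross = bilinear_vector(low_small, high)      # low x high interaction
    bottom = bilinear_vector(low_small, low_small)  # low-level self interaction
    return np.concatenate([cross, bottom])


# Toy feature maps standing in for two layers of a base network such as VGGNet.
rng = np.random.default_rng(0)
low = rng.standard_normal((4, 8, 8))   # shallow layer: few channels, high resolution
high = rng.standard_normal((6, 4, 4))  # deep layer: more channels, low resolution
descriptor = fused_descriptor(low, high)
print(descriptor.shape)
```

In a full pipeline the fused descriptor would feed a linear classifier; concatenation is only one possible fusion choice and is assumed here for simplicity.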