Malvi Mungalpara, Priyanka Goradia, Trisha Baldha, Yanvi Soni
2021 International Conference on Artificial Intelligence and Machine Vision (AIMV), published 2021-09-24. DOI: https://doi.org/10.1109/aimv53313.2021.9670955
Deep Convolutional Neural Networks for Scene Understanding: A Study of Semantic Segmentation Models
Semantic image segmentation for autonomous cars has been gaining popularity, with researchers trying to improve the models as much as possible. In this paper, we compare three models used for semantic image segmentation: UNet, VGG16_FCN and ResNet50_FCN. We trained and tested these models on the Cityscapes dataset, where the models classify each pixel of an image into one of several classes. Results show that the class-wise accuracy of ResNet50_FCN is higher than that of the other two models. We also plotted IoU graphs for each model and found that ResNet50_FCN and VGG16_FCN have much better scores than the UNet model. Based on these results, we show that ResNet50_FCN outperforms the other two models for semantic segmentation for scene understanding.
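The abstract compares the models by class-wise accuracy and IoU but does not spell out how these are computed. The following is a minimal sketch using the standard definitions, computed from a pixel-level confusion matrix; it is not the authors' evaluation code. The function names `confusion_matrix` and `per_class_metrics` and the tiny 3-class label maps are hypothetical and chosen only for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, predicted) pair as a single index, then count occurrences.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def per_class_metrics(cm):
    """Return (class-wise accuracy, IoU) per class; NaN where a class is absent."""
    tp = np.diag(cm).astype(float)        # pixels correctly labelled as class c
    gt = cm.sum(axis=1).astype(float)     # pixels whose ground truth is class c
    pred = cm.sum(axis=0).astype(float)   # pixels predicted as class c
    union = gt + pred - tp                # |ground truth OR prediction| for class c
    with np.errstate(divide="ignore", invalid="ignore"):
        acc = np.where(gt > 0, tp / gt, np.nan)
        iou = np.where(union > 0, tp / union, np.nan)
    return acc, iou

# Hypothetical 2x3 label maps with 3 classes (illustration only, not paper data).
y_true = np.array([[0, 0, 1], [1, 2, 2]])
y_pred = np.array([[0, 1, 1], [1, 2, 0]])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc, iou = per_class_metrics(cm)
print("class-wise accuracy:", acc)
print("per-class IoU:", iou)
print("mean IoU:", np.nanmean(iou))
```

Class-wise accuracy only penalizes missed pixels of a class, while IoU also penalizes pixels wrongly assigned to it, which is one reason the two metrics can rank models differently.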