Nadeem Atif, H. Balaji, Saquib Mazhar, S. R. Ahamad, M. Bhuyan
{"title":"语义掩蔽:一种缓解实时语义分割中类不平衡问题的新技术","authors":"Nadeem Atif, H. Balaji, Saquib Mazhar, S. R. Ahamad, M. Bhuyan","doi":"10.1109/NCC55593.2022.9806776","DOIUrl":null,"url":null,"abstract":"In the field of computer vision and scene under-standing, semantic segmentation is considered to be one of the most challenging task. This is due to the fact that it has to solve all the three standard vision problems, multi-class classification, object detection and image segmentation. One of the most promising areas of application of semantic segmentation is autonomous driving. The advent of deep-learning and the availability of large-scale datasets has enabled the research com-munity to reach to unprecedented performance heights compared to traditional machine learning algorithms. However, despite all the progress, existing learning methods still face the problem of class-imbalance because of which large classes get more attention and consequently the network becomes biased towards them. The problem of class-imbalance is particularly more prominent in urban road-scene datasets. This is because the layout of the scene captured by the camera mounted on a fixed location, naturally causes certain less important classes to occupy more area in the dataset. Trees, sky and buildings are some of the examples of large classes which frequently occur and occupy large areas despite the fact that they are less important with regards to driving related decision making. To tackle this problem, in this work, we have done the statistical analysis of the famous Cityscapes dataset to uncover the hidden patterns that large and small classes follow. Based on these patterns, we propose a semantic masking technique, that enables our proposed network MaskNet to pay more attention to regions where the smaller classes are more likely to occur. In this way, we see a significant performance increase with regards to smaller classes and the problem of class-imbalance is mitigated to a good extent.","PeriodicalId":403870,"journal":{"name":"2022 National Conference on Communications (NCC)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Semantic Masking: A Novel Technique to Mitigate the Class-Imbalance Problem in Real-Time Semantic Segmentation\",\"authors\":\"Nadeem Atif, H. Balaji, Saquib Mazhar, S. R. Ahamad, M. Bhuyan\",\"doi\":\"10.1109/NCC55593.2022.9806776\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the field of computer vision and scene under-standing, semantic segmentation is considered to be one of the most challenging task. This is due to the fact that it has to solve all the three standard vision problems, multi-class classification, object detection and image segmentation. One of the most promising areas of application of semantic segmentation is autonomous driving. The advent of deep-learning and the availability of large-scale datasets has enabled the research com-munity to reach to unprecedented performance heights compared to traditional machine learning algorithms. However, despite all the progress, existing learning methods still face the problem of class-imbalance because of which large classes get more attention and consequently the network becomes biased towards them. The problem of class-imbalance is particularly more prominent in urban road-scene datasets. This is because the layout of the scene captured by the camera mounted on a fixed location, naturally causes certain less important classes to occupy more area in the dataset. Trees, sky and buildings are some of the examples of large classes which frequently occur and occupy large areas despite the fact that they are less important with regards to driving related decision making. To tackle this problem, in this work, we have done the statistical analysis of the famous Cityscapes dataset to uncover the hidden patterns that large and small classes follow. Based on these patterns, we propose a semantic masking technique, that enables our proposed network MaskNet to pay more attention to regions where the smaller classes are more likely to occur. In this way, we see a significant performance increase with regards to smaller classes and the problem of class-imbalance is mitigated to a good extent.\",\"PeriodicalId\":403870,\"journal\":{\"name\":\"2022 National Conference on Communications (NCC)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 National Conference on Communications (NCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NCC55593.2022.9806776\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 National Conference on Communications (NCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NCC55593.2022.9806776","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Semantic Masking: A Novel Technique to Mitigate the Class-Imbalance Problem in Real-Time Semantic Segmentation
In the field of computer vision and scene under-standing, semantic segmentation is considered to be one of the most challenging task. This is due to the fact that it has to solve all the three standard vision problems, multi-class classification, object detection and image segmentation. One of the most promising areas of application of semantic segmentation is autonomous driving. The advent of deep-learning and the availability of large-scale datasets has enabled the research com-munity to reach to unprecedented performance heights compared to traditional machine learning algorithms. However, despite all the progress, existing learning methods still face the problem of class-imbalance because of which large classes get more attention and consequently the network becomes biased towards them. The problem of class-imbalance is particularly more prominent in urban road-scene datasets. This is because the layout of the scene captured by the camera mounted on a fixed location, naturally causes certain less important classes to occupy more area in the dataset. Trees, sky and buildings are some of the examples of large classes which frequently occur and occupy large areas despite the fact that they are less important with regards to driving related decision making. To tackle this problem, in this work, we have done the statistical analysis of the famous Cityscapes dataset to uncover the hidden patterns that large and small classes follow. Based on these patterns, we propose a semantic masking technique, that enables our proposed network MaskNet to pay more attention to regions where the smaller classes are more likely to occur. In this way, we see a significant performance increase with regards to smaller classes and the problem of class-imbalance is mitigated to a good extent.