Masud Ahmed, Zahid Hasan, Tim Yingling, Eric O'Leary, S. Purushotham, Suya You, Nirmalya Roy
An Online Continuous Semantic Segmentation Framework With Minimal Labeling Efforts
2023 IEEE International Conference on Smart Computing (SMARTCOMP), published 2023-06-01
DOI: https://doi.org/10.1109/SMARTCOMP58114.2023.00032
Abstract
Domain-adaptation-based semantic segmentation, which iteratively constructs pseudo-labels on unlabeled target data and retrains the network, has greatly reduced the annotation load for new datasets. However, realistic segmentation datasets are often imbalanced: pseudo-labels tend to favor certain "head" classes while neglecting "tail" classes, which can lead to inaccurate and noisy masks. To address this issue, we propose a novel hard sample mining strategy for an active domain-adaptation-based semantic segmentation network, with the aim of automatically selecting a small subset of labeled target data to fine-tune the network. We rank the difficulty of samples by calculating their class-wise entropy. In place of cross-entropy loss, we train the segmentation network with a fusion of focal loss and regional mutual information loss. The entire framework is implemented in real time using the Robot Operating System (ROS), with a server PC and a small Unmanned Ground Vehicle (UGV), the ROSbot2.0 Pro. This implementation allows the ROSbot2.0 Pro to access any type of data at any time, enabling it to perform a variety of tasks. Extensive experiments demonstrate superior performance compared to existing state-of-the-art methods. Remarkably, by using just 20% of the hard samples for fine-tuning, our network achieves a level of performance comparable (≈88%) to that of a fully supervised approach, with an mIoU of 60.51% on the In-house dataset.
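The abstract does not spell out how class-wise entropy is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each image is given as a flat list of per-pixel softmax probability vectors, groups pixels by their predicted class, averages the entropy within each class, and then averages across classes so that rare "tail" classes weigh as much as frequent "head" classes. The function names and the `fraction` parameter are hypothetical; the default of 0.2 mirrors the 20% budget mentioned above.

```python
import math
from collections import defaultdict


def pixel_entropy(probs):
    """Shannon entropy (in nats) of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def classwise_entropy_score(prob_map):
    """Score one image from a flat list of per-pixel probability vectors.

    Pixels are grouped by predicted class (argmax). The score is the mean of
    the per-class average entropies, so a rare class counts as much as a
    frequent one. This grouping rule is an assumption of the sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for probs in prob_map:
        predicted = max(range(len(probs)), key=probs.__getitem__)
        sums[predicted] += pixel_entropy(probs)
        counts[predicted] += 1
    if not counts:
        return 0.0
    return sum(sums[c] / counts[c] for c in counts) / len(counts)


def select_hard_samples(prob_maps, fraction=0.2):
    """Return indices of the highest-scoring (hardest) images, hardest first."""
    scores = [classwise_entropy_score(pm) for pm in prob_maps]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    budget = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:budget]
```

Averaging per class before averaging across classes is a design choice of this sketch: a plain per-pixel mean would let a large, confidently predicted region hide a small, uncertain one.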