Transformer framework for depth-assisted UDA semantic segmentation
Engineering Applications of Artificial Intelligence, published 2024-09-03. DOI: 10.1016/j.engappai.2024.109206. Article page: https://www.sciencedirect.com/science/article/pii/S0952197624013642
Unsupervised domain adaptation (UDA) plays a crucial role in transferring models trained on synthetic datasets to real-world datasets. In semantic segmentation, UDA reduces the need for large numbers of dense semantic annotations. Some UDA semantic segmentation approaches already leverage depth information to enhance semantic features and improve segmentation accuracy. Building on this, we introduce a multitask Transformer framework for UDA called Multi-former. Multi-former contains a semantic segmentation network and a depth estimation network. The depth estimation network extracts more informative depth features, which are used both to estimate depth and to assist semantic segmentation. In addition, to address the imbalanced distribution of pixels across classes in the source domain, we present a rare class mix (RCM) strategy that balances domain adaptability across all classes. To further improve UDA semantic segmentation performance, we design a mixed label loss weight (MLW) strategy, which applies different types of weights to make fuller use of the pseudo-labels. Experimental results demonstrate the effectiveness of the proposed approach, which achieves the best mean intersection over union (mIoU) of 56.1% and 76.3%, respectively, on the two synthetic-to-real UDA benchmark tasks. The code and models are available at https://github.com/fz-ss/Multi-former.
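For readers unfamiliar with the metric reported above, the following is a minimal sketch of how mean intersection over union can be computed. It is not taken from the Multi-former repository. The function name `mean_iou`, the flattened label lists, and the `ignore_index` value of 255 are assumptions made for illustration. Benchmark evaluations typically accumulate intersections and unions over a whole dataset rather than a single image.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; not specified in the abstract
) -> float:
    """Average the per-class IoU over classes that occur in pred or target.

    Labels are assumed to be integers in [0, num_classes), except for
    target pixels equal to ignore_index, which are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from the metric
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1

    # Classes that never appear are left out of the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    # Per-class IoU: class 0 = 1/2, class 1 = 2/3, class 2 = 1/1.
    print(f"mIoU = {mean_iou(pred, target, num_classes=3):.4f}")
```

In this toy input the last pixel is ignored, so the three class IoUs are 1/2, 2/3, and 1, and their average is the reported mIoU.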
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.