Domain adaptive YOLO based on image style selection and synergistic domain classifier
Authors: Yipeng Zhou, Huaming Qian
Journal: Displays, vol. 87, Article 102967 (JCR Q1, Computer Science, Hardware & Architecture; impact factor 3.7)
Publication type: Journal Article
Published: 2025-01-18
DOI: 10.1016/j.displa.2025.102967
URL: https://www.sciencedirect.com/science/article/pii/S0141938225000046
Citation count: 0
Abstract
Object detectors are typically trained on standard datasets collected under favorable conditions, yet in the real world they encounter a wide range of extreme environments. The resulting distribution shift between training and test data seriously degrades model performance, and unsupervised domain adaptation (UDA) is the most cost-effective way to address it. In this work, we use YOLOv8 as the underlying detector to construct a domain adaptive framework called YOLO-SDCoN, which offers a new solution paradigm for the domain shift problem. Specifically, we propose a Synergistic Domain Classifier (SDC) with richer gradient flow, which takes all the multi-scale features used for detection as inputs, providing a more adequate way to generate domain-invariant features while eliminating the gradient vanishing phenomenon. Furthermore, we propose a novel Batch-Instance Co-Normalization (BI-CoN) method, which enables adaptive selection and preservation of image styles under the implicit guidance of a domain classifier, thereby generating better domain-invariant features and enhancing the robustness of cross-domain detection. We conducted extensive experiments on the KITTI, Cityscapes, Foggy Cityscapes, and SIM10K datasets. The results show that the proposed YOLO-SDCoN is comprehensively superior to Faster R-CNN based domain adaptive frameworks and also achieves better results than other methods.
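The abstract does not give the BI-CoN formulation, so the following is only a minimal sketch of the general idea it builds on: blending batch-normalized and instance-normalized features with a per-channel gate. The function name `batch_instance_blend`, the gate `rho`, and the nested-list layout `[batch][channel][position]` are assumptions made for illustration and are not taken from the paper. In a real detector the gate would presumably be learned, and tensors from a deep learning library would replace the plain lists.

```python
from math import sqrt

EPS = 1e-5  # small constant for numerical stability (assumed value)


def _mean_var(values):
    """Return the mean and (biased) variance of a flat list of floats."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def _normalize(values, mean, var):
    """Standardize values with the given statistics."""
    return [(v - mean) / sqrt(var + EPS) for v in values]


def batch_instance_blend(x, rho):
    """Blend batch- and instance-normalized features.

    x   : nested list indexed as [batch][channel][position] (hypothetical layout)
    rho : per-channel gate in [0, 1]; 1 keeps only the batch-normalized
          response, 0 keeps only the instance-normalized response
    """
    n_batch, n_chan = len(x), len(x[0])
    out = [[None] * n_chan for _ in range(n_batch)]
    for c in range(n_chan):
        # Batch statistics: pooled over every sample and position of channel c.
        pooled = [v for n in range(n_batch) for v in x[n][c]]
        b_mean, b_var = _mean_var(pooled)
        for n in range(n_batch):
            # Instance statistics: a single sample and a single channel.
            i_mean, i_var = _mean_var(x[n][c])
            bn = _normalize(x[n][c], b_mean, b_var)
            inn = _normalize(x[n][c], i_mean, i_var)
            out[n][c] = [rho[c] * b + (1 - rho[c]) * i for b, i in zip(bn, inn)]
    return out


# Two single-channel "images" that differ only in brightness and contrast.
x = [[[1.0, 2.0, 3.0]], [[10.0, 20.0, 30.0]]]
only_instance = batch_instance_blend(x, rho=[0.0])  # per-image style removed
only_batch = batch_instance_blend(x, rho=[1.0])     # per-image style preserved
```

With the gate at 0, the two samples map to nearly identical values because each is standardized with its own statistics, which discards per-image style. With the gate at 1, the samples stay distinguishable because they share pooled statistics. Intermediate gate values trade off the two behaviors per channel.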
About the journal:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.