Accelerated Self-Supervised Multi-Illumination Color Constancy with Hybrid Knowledge Distillation
Authors: Ziyu Feng, Bing Li, Congyan Lang, Zheming Xu, Haina Qin, Juan Wang, Weihua Xiong
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence (Impact Factor 20.8; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2025-06-25
Publication type: Journal Article
DOI: https://doi.org/10.1109/tpami.2025.3583090
Citations: 0
Abstract
Color constancy, the human visual system's ability to perceive consistent colors under varying illumination conditions, is crucial for accurate color perception. Recently, deep learning algorithms have been applied to this task with remarkable results. However, existing methods are limited by the scale of current multi-illumination datasets and by model size, which hinders their ability to learn discriminative features and reduces their practical value for deployment in cameras. To overcome these limitations, this paper proposes a multi-illumination color constancy approach based on self-supervised learning and knowledge distillation. The approach has three phases: self-supervised pre-training, supervised fine-tuning, and knowledge distillation. During the pre-training phase, we train Transformer-based and U-Net-based encoders on two pretext tasks: a light normalization task to learn contextual representations of lighting color, and a grayscale colorization task to acquire objects' inherent color information. For the downstream color constancy task, we fine-tune the encoders and design a lightweight decoder to obtain better illumination distributions with fewer parameters. During the knowledge distillation phase, we introduce a hybrid knowledge distillation technique to align CNN features with those of the Transformer and the U-Net, respectively. Our proposed method outperforms state-of-the-art techniques on multi-illumination and single-illumination benchmarks. Extensive ablation studies and visualizations confirm the effectiveness of our model.
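The distillation phase described above can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything here is an assumption made for illustration: the use of mean squared error as the alignment measure, the function names `feature_alignment_loss` and `hybrid_distillation_loss`, the weights `w_transformer` and `w_unet`, and the premise that the student CNN exposes one feature map per teacher with a matching shape. It should not be read as the authors' implementation.

```python
import numpy as np


def feature_alignment_loss(student, teacher):
    """Mean squared error between two feature maps of identical shape.

    Assumption: MSE is used only as a simple stand-in for whatever
    alignment measure the paper actually employs.
    """
    student = np.asarray(student, dtype=np.float64)
    teacher = np.asarray(teacher, dtype=np.float64)
    if student.shape != teacher.shape:
        raise ValueError("student and teacher feature maps must share a shape")
    return float(np.mean((student - teacher) ** 2))


def hybrid_distillation_loss(student_for_transformer, transformer_feats,
                             student_for_unet, unet_feats,
                             w_transformer=0.5, w_unet=0.5):
    """Weighted sum of two alignment terms, one per teacher.

    Hypothetical formulation: one student CNN feature map is aligned with
    the Transformer teacher and another with the U-Net teacher. The weights
    are illustrative defaults, not values from the paper.
    """
    loss_transformer = feature_alignment_loss(student_for_transformer,
                                              transformer_feats)
    loss_unet = feature_alignment_loss(student_for_unet, unet_feats)
    return w_transformer * loss_transformer + w_unet * loss_unet


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy feature maps with shape (channels, height, width).
    s_t = rng.normal(size=(8, 16, 16))
    s_u = rng.normal(size=(8, 16, 16))
    t_feats = rng.normal(size=(8, 16, 16))
    u_feats = rng.normal(size=(8, 16, 16))
    print(hybrid_distillation_loss(s_t, t_feats, s_u, u_feats))
```

In a real training loop this term would be computed in a deep learning framework and added to the supervised illumination-estimation loss; the NumPy version only shows the structure of a two-teacher alignment objective.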
About the journal:
The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.