Title: Prompt-guided image color aesthetics assessment: Models, datasets and benchmarks
Authors: Shuai He, Yi Xiao, Anlong Ming, Huadong Ma
Journal: Information Fusion, vol. 114, Article 102706 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 14.7)
Published: 2024-10-01
DOI: 10.1016/j.inffus.2024.102706
URL: https://www.sciencedirect.com/science/article/pii/S1566253524004846
Citations: 0
Abstract
Image color aesthetics assessment (ICAA) aims to assess color aesthetics based on human perception, which is crucial for applications such as imaging measurement and image analysis. Previous methods are limited to holistic evaluation, which hinders their ability to offer explainability from multiple perspectives. Moreover, existing ICAA datasets often lack multi-attribute annotations beyond holistic scores. Such annotations are necessary to supervise the training and validation of models’ multi-perspective assessment capabilities, and their absence hinders effective generalization. To advance ICAA research, (1) we propose an “all-in-one” model called the Prompt-Guided Delegate Transformer (Prompt-DeT). Prompt-DeT uses dedicated prompt strategies and an Aesthetic Adapter (Aes-Adapter) to exploit the rich visual-language prior embedded in large pre-trained vision-language models. This enhances the model’s perception of multiple attributes, enabling impressive zero-shot and fine-tuning capabilities on sub-attribute tasks, and even supports user-customized scenarios. (2) We carefully construct a color-oriented dataset, ICAA20K, containing 20K images and 6 annotated dimensions to support both holistic and sub-attribute ICAA tasks. (3) We develop a comprehensive benchmark comprising 17 methods, the most extensive to date, based on four datasets (ICAA20K, ICAA17K, SPAQ, and PARA) for evaluating the holistic and sub-attribute performance of ICAA methods. Our work not only achieves state-of-the-art (SOTA) performance but also offers the community a roadmap for exploring solutions to ICAA. The code and dataset are available at https://github.com/woshidandan/Image-Color-Aesthetics-Assessment/blob/main/Refine-for-ICAA.md.
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.