Prompt-guided image color aesthetics assessment: Models, datasets and benchmarks

Impact factor: 14.7; Zone 1 (Computer Science); JCR Q1 (Computer Science, Artificial Intelligence)
Shuai He, Yi Xiao, Anlong Ming, Huadong Ma
Journal: Information Fusion, volume 114, Article 102706
Publication date: 2024-10-01
DOI: 10.1016/j.inffus.2024.102706
URL: https://www.sciencedirect.com/science/article/pii/S1566253524004846
Citation count: 0

Abstract

Image color aesthetics assessment (ICAA) aims to assess color aesthetics based on human perception, which is crucial for applications such as imaging measurement and image analysis. Previous methods are limited to holistic evaluation, which hinders their ability to offer explainability from multiple perspectives. Moreover, existing ICAA datasets often lack multi-attribute annotations beyond holistic scores. Such annotations are necessary to provide effective supervision for training or validating models' multi-perspective assessment capabilities, and their absence hinders effective generalization. To advance ICAA research, (1) we propose an "all-in-one" model called the Prompt-Guided Delegate Transformer (Prompt-DeT). Prompt-DeT uses dedicated prompt strategies and an Aesthetic Adapter (Aes-Adapter) to exploit the rich vision-language prior embedded in large pre-trained vision-language models. This enhances the model's perception of multiple attributes, enables zero-shot and fine-tuning capabilities on sub-attribute tasks, and even supports user-customized scenarios. (2) We carefully construct a color-oriented dataset, ICAA20K, containing 20K images and 6 annotated dimensions to support both holistic and sub-attribute ICAA tasks. (3) We develop a comprehensive benchmark comprising 17 methods, the most extensive to date, based on four datasets (ICAA20K, ICAA17K, SPAQ, and PARA) for evaluating the holistic and sub-attribute performance of ICAA methods. Our work not only achieves state-of-the-art (SOTA) performance, but also offers the community a roadmap for exploring solutions to ICAA. The code and dataset are available at https://github.com/woshidandan/Image-Color-Aesthetics-Assessment/blob/main/Refine-for-ICAA.md.
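To make the idea of prompt-guided zero-shot scoring on sub-attributes more concrete, the following is a minimal sketch of one common way to score an image against a pair of opposing text prompts using a vision-language embedding space. It is not the Prompt-DeT implementation: the abstract does not describe the model's internals, so the function names, the attribute names ("color harmony", "colorfulness"), the temperature value, and the toy embedding vectors are all assumptions made for illustration. In practice the embeddings would come from a pre-trained vision-language encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_score(image_emb, pos_prompt_emb, neg_prompt_emb, temperature=0.1):
    """Two-way softmax over similarities to a positive and a negative prompt.

    Returns a value in (0, 1); values above 0.5 mean the image embedding
    is closer to the positive prompt than to the negative one.
    The temperature is a hypothetical choice, not taken from the paper.
    """
    s_pos = cosine(image_emb, pos_prompt_emb) / temperature
    s_neg = cosine(image_emb, neg_prompt_emb) / temperature
    m = max(s_pos, s_neg)  # subtract the max for numerical stability
    e_pos = math.exp(s_pos - m)
    e_neg = math.exp(s_neg - m)
    return e_pos / (e_pos + e_neg)


# Toy 3-dimensional embeddings standing in for encoder outputs (hypothetical).
image = [0.9, 0.1, 0.2]

# Hypothetical attribute names, each with a (positive, negative) prompt embedding.
attribute_prompts = {
    "color harmony": ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    "colorfulness": ([0.0, 0.0, 1.0], [0.0, 1.0, 0.0]),
}

scores = {
    name: zero_shot_score(image, pos, neg)
    for name, (pos, neg) in attribute_prompts.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because each attribute is defined only by its prompt pair, a user-customized attribute could be added in this sketch by supplying another pair of prompt embeddings, without retraining anything.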
Source journal
Information Fusion (category: Engineering and Technology, Computer Science: Theory and Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.