Pre-Activating Semantic Information for Image Aesthetic Assessment
J. Song, Rong Huang, Yujia Tian, Aihua Dong
AATCC Journal of Research. Journal article, published 2023-02-05. DOI: 10.1177/24723444221147971 (https://doi.org/10.1177/24723444221147971)
Journal metrics: impact factor 0.6; JCR Q4 (Materials Science, Textiles)
Automatic image aesthetic evaluation is an attractive and challenging visual task. Methods based on convolutional neural networks have recently achieved remarkable performance. However, semantic information, an intuitive prerequisite for evaluating image aesthetics, has received too little attention in previous methods, and how to extract it efficiently and use it to assist aesthetic evaluation remains an open problem. In this article, we propose using a self-supervised auto-encoder, trained in a multi-task learning setup, to extract semantic information. A fusing module is then prepended at the bottleneck layer to explicitly combine semantic information with aesthetic information in a pre-activated manner. Specifically, we implement a customized pooling operation to pool the semantic features extracted by the auto-encoder and apply a weak constraint between the pooled semantic features and the aesthetic information to realize the combination. The subsequent regressor performs aesthetic evaluation on the combined semantic–aesthetic features. In addition, so that the model can handle images of arbitrary aspect ratio, a second pooling strategy, spatial pyramid pooling, is adopted to obtain image features of a fixed length. Our method achieves competitive performance on the public image aesthetic evaluation benchmark. In particular, on the most commonly used metric, the Spearman rank-order correlation coefficient, the proposed model achieves the best performance among the state-of-the-art methods it was compared with. Extensive ablation studies and visualization experiments demonstrate the effectiveness of our method.
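To illustrate why spatial pyramid pooling yields a fixed-length feature for any aspect ratio, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)`, the use of max pooling, and the single-channel feature map are all assumptions chosen for illustration.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (a list of rows) into a fixed-length vector.

    For each pyramid level n, the map is split into an n x n grid of bins
    and the maximum of each bin is kept. The output length is therefore
    sum(n * n for n in levels), independent of the input height and width.
    The levels and the choice of max pooling are illustrative assumptions.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # floor/ceil boundaries make the bins cover the whole map and
            # guarantee no bin is empty, even when a side is shorter than n.
            top = math.floor(i * height / n)
            bottom = math.ceil((i + 1) * height / n)
            for j in range(n):
                left = math.floor(j * width / n)
                right = math.ceil((j + 1) * width / n)
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(top, bottom)
                    for c in range(left, right)
                ))
    return pooled


# Two hypothetical feature maps with different aspect ratios.
wide = [[float(r * 7 + c) for c in range(7)] for r in range(3)]  # 3 x 7
tall = [[float(r * 4 + c) for c in range(4)] for r in range(9)]  # 9 x 4

# Both vectors have 1 + 4 + 16 = 21 entries.
wide_vector = spatial_pyramid_pool(wide)
tall_vector = spatial_pyramid_pool(tall)
```

In a convolutional network the same pooling would be applied to every channel and the results concatenated, so a downstream regressor always receives an input of the same size.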
About the journal:
AATCC Journal of Research is a textile research journal with a broad scope, ranging from advanced materials, fibers, and textile and polymer chemistry to color science, apparel design, and sustainability.
The journal is indexed in the Science Citation Index Expanded (SCIE) and is discoverable in the Clarivate Analytics Web of Science Core Collection. Its impact factor is available in Journal Citation Reports.