Yixiao Li, Xiaoyuan Yang, Yuqing Luo, Hadi Amirpour, Hantao Liu, Wei Zhou
Displays, Volume 90, Article 103131. Published 2025-07-08. DOI: 10.1016/j.displa.2025.103131. Impact factor: 3.4 (JCR Q1, Computer Science, Hardware & Architecture). URL: https://www.sciencedirect.com/science/article/pii/S0141938225001684
Unlocking implicit motion for evaluating image complexity
Image complexity (IC) plays a critical role in both cognitive science and multimedia computing, influencing visual aesthetics, emotional responses, and tasks such as image classification and enhancement. However, defining and quantifying IC remains challenging due to its multifaceted nature, which encompasses both objective attributes (e.g., detail, structure) and subjective human perception. While traditional methods rely on entropy-based or multidimensional approaches, and recent advances employ machine learning and shallow neural networks, these techniques often fail to fully capture the subjective aspects of IC. Inspired by the fact that the human visual system inherently perceives implicit motion in static images, we propose a novel approach to address this gap by explicitly incorporating hidden motion into IC assessment. We introduce the motion-inspired image complexity assessment metric (MICM) as a new framework for this purpose. MICM uses a dual-branch architecture: one branch extracts spatial features from static images, while the other generates short video sequences to analyze latent motion dynamics. To ensure meaningful motion representation, we design a hierarchical loss function that aligns video features with text prompts derived from image-to-text models, refining motion semantics at both local (i.e., frame and word) and global levels. Experiments on three public image complexity assessment (ICA) databases demonstrate that our approach, MICM, significantly outperforms state-of-the-art methods, validating its effectiveness. The code will be made publicly available upon acceptance of the paper.
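The abstract does not specify the form of the hierarchical loss, so the following is only an illustrative sketch of how a local (frame and word) term and a global term might be combined. It assumes cosine similarity as the alignment measure, mean pooling for the global embeddings, and a best-matching-frame rule for the local term; all function names, the `global_weight` parameter, and these design choices are hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def global_alignment_loss(frame_feats, word_feats):
    """Cosine distance between mean-pooled video and text embeddings.

    frame_feats: array of shape (T, D), one feature vector per video frame.
    word_feats:  array of shape (N, D), one feature vector per prompt word.
    Mean pooling is an assumption made for this sketch.
    """
    video_vec = l2_normalize(frame_feats.mean(axis=0))
    text_vec = l2_normalize(word_feats.mean(axis=0))
    return 1.0 - float(video_vec @ text_vec)


def local_alignment_loss(frame_feats, word_feats):
    """Average cosine distance between each word and its best-matching frame.

    The max-over-frames matching rule is an assumption made for this sketch.
    """
    frames = l2_normalize(frame_feats)      # (T, D)
    words = l2_normalize(word_feats)        # (N, D)
    similarity = words @ frames.T           # (N, T) word-to-frame cosines
    best_per_word = similarity.max(axis=1)  # (N,)
    return 1.0 - float(best_per_word.mean())


def hierarchical_loss(frame_feats, word_feats, global_weight=1.0):
    """Local term plus a weighted global term (hypothetical combination)."""
    local = local_alignment_loss(frame_feats, word_feats)
    global_ = global_alignment_loss(frame_feats, word_feats)
    return local + global_weight * global_
```

Calling `hierarchical_loss(frame_feats, word_feats)` with a `(T, D)` array of frame features and an `(N, D)` array of word features returns a single scalar that shrinks as the two modalities align. In a real training setup the same computation would be written in an autodiff framework so that gradients flow back to the video branch.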
About the journal:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally be featured.