Israa K. Salman Al-Tameemi, Mohammad-Reza Feizi-Derakhshi, Zari Farhadi, Amir-Reza Feizi-Derakhshi
{"title":"基于内容图像检索和多输入卷积神经网络的视觉情感分类混合模型","authors":"Israa K. Salman Al-Tameemi, Mohammad-Reza Feizi-Derakhshi, Zari Farhadi, Amir-Reza Feizi-Derakhshi","doi":"10.1155/int/5581601","DOIUrl":null,"url":null,"abstract":"<p>With the exponential growth of multimedia content, visual sentiment classification has emerged as a significant research area. However, it poses unique challenges due to the complexity and subjective nature of the visual information. This can be attributed to the significant presence of semantically ambiguous images within the current benchmark datasets, which enhances the performance of sentiment analysis but ignores the differences between various annotators. Moreover, most current methods concentrate on improving local emotional representations that focus on object extraction procedures rather than utilizing robust features that can effectively indicate the relevance of objects within an image through color information. Motivated by these observations, this paper addresses the need for efficient algorithms for labeling and classifying sentiment from visual images by introducing a novel hybrid model, which combines content-based image retrieval (CBIR) and a multi-input convolutional neural network (CNN). The CBIR model extracts color features from all dataset images, creating a numerical representation for each. It compares a query image to dataset images’ features to find similar features. This process continues until the images are grouped according to color similarity, which allows accurate sentimental categories based on similar features and feelings. Then, a multi-input CNN model is utilized to extract and efficiently incorporate high-level contextual visual information. This model comprises 70 layers, with six branches, each containing 11 layers. It seeks to facilitate the fusion of complementary information by incorporating multiple input categories that differ according to the color features extracted by the CBIR technique. This feature enables the model to understand the target and generate more precise predictions fully. The proposed model demonstrates significant improvements over existing algorithms, as evidenced by evaluations of six benchmark datasets of varying sizes. Also, it outperforms the state of the art in sentiment classification accuracy, getting 87.88%, 84.62%, 84.1%, 83.7%, 80.7%, and 91.2% accuracy for the EmotionROI, ArtPhoto, Twitter I, Twitter II, Abstract, and FI datasets, respectively. Furthermore, the model is evaluated on two newly collected large datasets, which confirm its scalability and robustness in handling large-scale sentiment classification tasks, and thus achieves a significant accuracy of 85.21% and 83.72% with the BGETTY and Twitter datasets, respectively. This paper contributes to the advancement of visual sentiment classification by offering a comprehensive solution for analyzing sentiment from images and laying the foundation for further research.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7000,"publicationDate":"2025-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/5581601","citationCount":"0","resultStr":"{\"title\":\"Hybrid Model for Visual Sentiment Classification Using Content-Based Image Retrieval and Multi-Input Convolutional Neural Network\",\"authors\":\"Israa K. 
Salman Al-Tameemi, Mohammad-Reza Feizi-Derakhshi, Zari Farhadi, Amir-Reza Feizi-Derakhshi\",\"doi\":\"10.1155/int/5581601\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>With the exponential growth of multimedia content, visual sentiment classification has emerged as a significant research area. However, it poses unique challenges due to the complexity and subjective nature of the visual information. This can be attributed to the significant presence of semantically ambiguous images within the current benchmark datasets, which enhances the performance of sentiment analysis but ignores the differences between various annotators. Moreover, most current methods concentrate on improving local emotional representations that focus on object extraction procedures rather than utilizing robust features that can effectively indicate the relevance of objects within an image through color information. Motivated by these observations, this paper addresses the need for efficient algorithms for labeling and classifying sentiment from visual images by introducing a novel hybrid model, which combines content-based image retrieval (CBIR) and a multi-input convolutional neural network (CNN). The CBIR model extracts color features from all dataset images, creating a numerical representation for each. It compares a query image to dataset images’ features to find similar features. This process continues until the images are grouped according to color similarity, which allows accurate sentimental categories based on similar features and feelings. Then, a multi-input CNN model is utilized to extract and efficiently incorporate high-level contextual visual information. This model comprises 70 layers, with six branches, each containing 11 layers. It seeks to facilitate the fusion of complementary information by incorporating multiple input categories that differ according to the color features extracted by the CBIR technique. This feature enables the model to understand the target and generate more precise predictions fully. The proposed model demonstrates significant improvements over existing algorithms, as evidenced by evaluations of six benchmark datasets of varying sizes. Also, it outperforms the state of the art in sentiment classification accuracy, getting 87.88%, 84.62%, 84.1%, 83.7%, 80.7%, and 91.2% accuracy for the EmotionROI, ArtPhoto, Twitter I, Twitter II, Abstract, and FI datasets, respectively. Furthermore, the model is evaluated on two newly collected large datasets, which confirm its scalability and robustness in handling large-scale sentiment classification tasks, and thus achieves a significant accuracy of 85.21% and 83.72% with the BGETTY and Twitter datasets, respectively. 
This paper contributes to the advancement of visual sentiment classification by offering a comprehensive solution for analyzing sentiment from images and laying the foundation for further research.</p>\",\"PeriodicalId\":14089,\"journal\":{\"name\":\"International Journal of Intelligent Systems\",\"volume\":\"2025 1\",\"pages\":\"\"},\"PeriodicalIF\":3.7000,\"publicationDate\":\"2025-10-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/5581601\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Intelligent Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1155/int/5581601\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1155/int/5581601","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Hybrid Model for Visual Sentiment Classification Using Content-Based Image Retrieval and Multi-Input Convolutional Neural Network
With the exponential growth of multimedia content, visual sentiment classification has emerged as a significant research area. However, it poses unique challenges due to the complexity and subjective nature of visual information. This can be attributed to the significant presence of semantically ambiguous images within current benchmark datasets, which hampers sentiment analysis and ignores the differences between annotators. Moreover, most current methods concentrate on improving local emotional representations through object-extraction procedures rather than exploiting robust features that can effectively indicate the relevance of objects within an image through color information. Motivated by these observations, this paper addresses the need for efficient algorithms for labeling and classifying sentiment in visual images by introducing a novel hybrid model that combines content-based image retrieval (CBIR) and a multi-input convolutional neural network (CNN). The CBIR model extracts color features from all dataset images, creating a numerical representation of each. It then compares a query image's features against those of the dataset images to find similar ones. This process continues until the images are grouped by color similarity, enabling accurate sentiment categories based on shared features and feelings. A multi-input CNN model is then used to extract and efficiently incorporate high-level contextual visual information. This model comprises 70 layers organized into six branches, each containing 11 layers. It facilitates the fusion of complementary information by incorporating multiple input categories that differ according to the color features extracted by the CBIR technique, which enables the model to fully understand the target and generate more precise predictions. The proposed model demonstrates significant improvements over existing algorithms, as evidenced by evaluations on six benchmark datasets of varying sizes. It outperforms the state of the art in sentiment classification accuracy, achieving 87.88%, 84.62%, 84.1%, 83.7%, 80.7%, and 91.2% on the EmotionROI, ArtPhoto, Twitter I, Twitter II, Abstract, and FI datasets, respectively. Furthermore, the model is evaluated on two newly collected large datasets, confirming its scalability and robustness on large-scale sentiment classification tasks, with accuracies of 85.21% and 83.72% on the BGETTY and Twitter datasets, respectively. This paper contributes to the advancement of visual sentiment classification by offering a comprehensive solution for analyzing sentiment in images and laying the foundation for further research.
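To make the color-grouping step concrete, the following is a minimal sketch of the kind of CBIR comparison the abstract describes. The paper does not specify the exact color descriptor or similarity measure, so an HSV color histogram with cosine similarity is assumed here; the function names and the grouping threshold are hypothetical.

```python
# Illustrative sketch of the CBIR color-grouping step (assumed descriptor:
# HSV histogram; assumed similarity: cosine). Not the authors' exact method.
import numpy as np
from PIL import Image

def color_histogram(path, bins=8):
    """Build a normalized 3-D HSV histogram as the image's color signature."""
    hsv = np.asarray(Image.open(path).convert("HSV"), dtype=np.float32)
    hist, _ = np.histogramdd(hsv.reshape(-1, 3), bins=(bins, bins, bins),
                             range=((0, 255), (0, 255), (0, 255)))
    hist = hist.flatten()
    return hist / (hist.sum() + 1e-8)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def group_by_color(paths, threshold=0.85):
    """Greedy grouping: each image joins the first group whose representative
    (query) histogram is similar enough, mirroring the query-versus-dataset
    comparison loop described in the abstract."""
    groups = []  # list of (representative_histogram, [member paths])
    for p in paths:
        h = color_histogram(p)
        for rep, members in groups:
            if cosine_similarity(rep, h) >= threshold:
                members.append(p)
                break
        else:
            groups.append((h, [p]))
    return [members for _, members in groups]
```

Applied to the whole dataset, the returned groups would serve as the color-based input categories fed to the network's branches.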
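Likewise, the multi-input fusion can be illustrated with a minimal Keras sketch of six parallel convolutional branches joined by concatenation. The layer types, filter counts, and eight-class output below are assumptions for illustration, not the authors' exact 70-layer design.

```python
# Minimal sketch of a six-branch, multi-input CNN fused by concatenation,
# in the spirit of the architecture the abstract describes. Branch depth,
# filter counts, and the 8-class head are assumptions, not the paper's design.
import tensorflow as tf
from tensorflow.keras import layers, Model

def make_branch(name, input_shape=(224, 224, 3)):
    """One branch: its own input (one CBIR color category) plus a small
    convolutional stack producing a feature vector."""
    inp = layers.Input(shape=input_shape, name=f"{name}_input")
    x = inp
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    return inp, x

inputs, features = [], []
for i in range(6):  # six branches, one per color-similarity group
    inp, feat = make_branch(f"branch{i}")
    inputs.append(inp)
    features.append(feat)

fused = layers.Concatenate()(features)               # fuse complementary features
fused = layers.Dense(256, activation="relu")(fused)
out = layers.Dense(8, activation="softmax")(fused)   # 8 emotion classes (assumed)

model = Model(inputs=inputs, outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

At training time each branch would receive images from one of the six CBIR color groups, so the concatenation layer fuses the complementary, color-conditioned features the abstract refers to.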
Journal Introduction:
The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis, creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.