{"title":"开发先进的机器学习模型,用于识别数字媒体中的虚拟蒙太奇","authors":"Yuxuan Liu","doi":"10.1016/j.entcom.2025.101002","DOIUrl":null,"url":null,"abstract":"<div><div>Virtual montages are compositions that blend several digital materials to produce a new, visually appealing narrative and are becoming increasingly popular, as a result of the quick development of digital media. Virtual montages have emerged as a potent technique for improving visual storytelling and content presentation across platforms in the digital age. However, the quick expansion of digital content has made advanced methods for effectively recognizing and classifying these montages necessary. This work suggests a novel method for virtual montage identification across a broad range of visual themes, content types, and media resolutions, and resolutions utilizing an Adaptive Flower Pollination Optimized Mutual Information with Naïve Bayes (AFPO-MI-NB) model. The research’s dataset comprises a variety of content from virtual montage detection dataset which enables the model to generalize effectively to data from the real world. To improve the model’s capacity to handle various image and video qualities, pre-processing methods, including data augmentation and pixel value normalization are used. Convolutional neural networks (CNN) are used for feature extraction to capture spatial patterns. The model uses AFPO-MI-NB classification to enhance classification accuracy and computational efficiency. AFPO allows for more adaptable feature weights, MI evaluates the relationship between visual features and the classification task, and NB classifier processes feature matrices for binary classification. The hybrid approach strengthens feature selection through global and local searches, resulting in a model that enhances classification accuracy and improves computational efficiency. According to the experimental data, this model provides a reliable solution for virtual montage identification across a variety of media formats, outperforming current techniques in terms of F1-score of 0.95, recall of 0.92, and precision of 0.97. This work has significant applications in automated digital media analysis, copyright enforcement, and content control.</div></div>","PeriodicalId":55997,"journal":{"name":"Entertainment Computing","volume":"55 ","pages":"Article 101002"},"PeriodicalIF":2.4000,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Developing an advanced machine learning model for identifying virtual montages in digital media\",\"authors\":\"Yuxuan Liu\",\"doi\":\"10.1016/j.entcom.2025.101002\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Virtual montages are compositions that blend several digital materials to produce a new, visually appealing narrative and are becoming increasingly popular, as a result of the quick development of digital media. Virtual montages have emerged as a potent technique for improving visual storytelling and content presentation across platforms in the digital age. However, the quick expansion of digital content has made advanced methods for effectively recognizing and classifying these montages necessary. This work suggests a novel method for virtual montage identification across a broad range of visual themes, content types, and media resolutions, and resolutions utilizing an Adaptive Flower Pollination Optimized Mutual Information with Naïve Bayes (AFPO-MI-NB) model. 
The research’s dataset comprises a variety of content from virtual montage detection dataset which enables the model to generalize effectively to data from the real world. To improve the model’s capacity to handle various image and video qualities, pre-processing methods, including data augmentation and pixel value normalization are used. Convolutional neural networks (CNN) are used for feature extraction to capture spatial patterns. The model uses AFPO-MI-NB classification to enhance classification accuracy and computational efficiency. AFPO allows for more adaptable feature weights, MI evaluates the relationship between visual features and the classification task, and NB classifier processes feature matrices for binary classification. The hybrid approach strengthens feature selection through global and local searches, resulting in a model that enhances classification accuracy and improves computational efficiency. According to the experimental data, this model provides a reliable solution for virtual montage identification across a variety of media formats, outperforming current techniques in terms of F1-score of 0.95, recall of 0.92, and precision of 0.97. This work has significant applications in automated digital media analysis, copyright enforcement, and content control.</div></div>\",\"PeriodicalId\":55997,\"journal\":{\"name\":\"Entertainment Computing\",\"volume\":\"55 \",\"pages\":\"Article 101002\"},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2025-07-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Entertainment Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1875952125000825\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, CYBERNETICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Entertainment Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1875952125000825","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, CYBERNETICS","Score":null,"Total":0}
Developing an advanced machine learning model for identifying virtual montages in digital media
Virtual montages are compositions that blend several digital materials into a new, visually appealing narrative, and they are becoming increasingly popular as a result of the rapid development of digital media. Virtual montages have emerged as a potent technique for improving visual storytelling and content presentation across platforms in the digital age. However, the rapid expansion of digital content has made advanced methods for effectively recognizing and classifying these montages necessary. This work proposes a novel method for virtual montage identification across a broad range of visual themes, content types, and media resolutions using an Adaptive Flower Pollination Optimized Mutual Information with Naïve Bayes (AFPO-MI-NB) model. The dataset comprises a variety of content from a virtual montage detection dataset, which enables the model to generalize effectively to real-world data. To improve the model's capacity to handle varying image and video quality, pre-processing methods including data augmentation and pixel value normalization are applied. Convolutional neural networks (CNNs) are used for feature extraction to capture spatial patterns. The model then applies AFPO-MI-NB classification to enhance classification accuracy and computational efficiency: AFPO allows for more adaptable feature weights, MI evaluates the relationship between visual features and the classification task, and the NB classifier processes the resulting feature matrices for binary classification. This hybrid approach strengthens feature selection through combined global and local searches. According to the experimental data, the model provides a reliable solution for virtual montage identification across a variety of media formats, outperforming current techniques with an F1-score of 0.95, a recall of 0.92, and a precision of 0.97. This work has significant applications in automated digital media analysis, copyright enforcement, and content control.
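To make the pipeline described in the abstract concrete, the sketch below approximates the classification stage with off-the-shelf components: mutual information scores CNN-style features, a flower-pollination-style search tunes per-feature weights, and Gaussian Naïve Bayes makes the binary montage/non-montage decision. This is a minimal illustration under stated assumptions, not the authors' implementation; the placeholder data, the feature dimensionality, and the simplified Gaussian perturbation (in place of the paper's adaptive pollination rules) are all assumptions introduced here.

```python
# Illustrative sketch of an AFPO-MI-NB-style classification stage.
# All names and parameters are hypothetical; real inputs would be CNN-extracted features.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder for CNN-extracted spatial features: rows are images/frames,
# columns are feature activations; labels are 1 = virtual montage, 0 = original footage.
X = rng.normal(size=(500, 64))
y = rng.integers(0, 2, size=500)

# Mutual information scores the relevance of each visual feature to the label.
mi = mutual_info_classif(X, y, random_state=0)

def fitness(weights):
    """Cross-validated Naïve Bayes accuracy for a candidate feature-weight vector."""
    return cross_val_score(GaussianNB(), X * weights, y, cv=3).mean()

# Flower-pollination-style search (greatly simplified): candidates start from the
# MI scores; "global" steps move toward the current best, "local" steps mix two candidates.
pop = [np.clip(mi + rng.normal(scale=0.1, size=mi.shape), 0, None) for _ in range(10)]
best = max(pop, key=fitness)
for _ in range(20):
    for i, w in enumerate(pop):
        if rng.random() < 0.8:                      # global pollination step
            cand = w + rng.normal(scale=0.2, size=w.shape) * (best - w)
        else:                                       # local pollination step
            a, b = rng.choice(len(pop), size=2, replace=False)
            cand = w + rng.random() * (pop[a] - pop[b])
        cand = np.clip(cand, 0, None)
        if fitness(cand) > fitness(w):              # keep the better weight vector
            pop[i] = cand
    best = max(pop, key=fitness)

print("best cross-validated accuracy:", round(fitness(best), 3))
```

In the actual model, the weight search would presumably follow the paper's adaptive switching probability and Lévy-flight global step rather than the Gaussian perturbation used here, and the features would come from the trained CNN backbone rather than random placeholders.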
Journal introduction:
Entertainment Computing publishes original, peer-reviewed research articles and serves as a forum for stimulating and disseminating innovative research ideas, emerging technologies, empirical investigations, state-of-the-art methods and tools in all aspects of digital entertainment, new media, entertainment computing, gaming, robotics, toys and applications among researchers, engineers, social scientists, artists and practitioners. Theoretical, technical, empirical, survey articles and case studies are all appropriate to the journal.