Enhancing video rumor detection through multimodal deep feature fusion with time-sync comments

Impact factor: 7.4 · Zone 1 (Management) · JCR Q1 (Computer Science, Information Systems)
Ming Yin, Wei Chen, Dan Zhu, Jijiao Jiang
Journal: Information Processing & Management, vol. 62, no. 1, Article 103935
Published: 2024-11-14
DOI: 10.1016/j.ipm.2024.103935
URL: https://www.sciencedirect.com/science/article/pii/S0306457324002942
Citations: 0

Abstract

Rumors in videos propagate more strongly than traditional text or image rumors. Most current studies on video rumor detection rely on combining user and video modal information while neglecting the internal multimodal aspects of the video and the relationship between user comments and local segments of the video. To address this problem, we propose the Time-Sync Comment Enhanced Multimodal Deep Feature Fusion Model (TSC-MDFFM). It introduces time-sync comments to enhance the propagation structure of videos on social networks, supplementing missing contextual or additional information in videos. Time-sync comments express users' views on specific points in time in the video, which helps to extract more valuable segments from information-dense videos. The time interval from one keyframe to the next in a video is defined as a local segment. We describe each segment using time-sync comments, video keyframes, and video subtitle texts. The local segment sequences are ordered along the video timeline and assigned time information, then fused to create the local feature representation of the video. Subsequently, we fuse the text features, video motion features, and visual features of video comments at the feature level to represent the global features of the video. This representation captures the overall propagation trend of the video content and provides a deeper understanding of the video as a whole. Finally, we integrate the local and global features for video rumor classification. We created a dataset called TSC-VRD, which includes time-sync comments and encompasses all visible information in videos. Extensive experiments on the TSC-VRD dataset show that the proposed model outperforms existing methods.
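
The abstract defines a local segment as the interval between two consecutive keyframes and attaches time-sync comments to it. The following is a minimal sketch of that grouping step only, not the authors' implementation: the names `LocalSegment` and `build_local_segments`, the half-open interval convention, and the sample timestamps are all assumptions made for illustration. Feature extraction and fusion are not shown.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class LocalSegment:
    """One keyframe-to-keyframe interval and the comments posted inside it."""
    start: float  # timestamp of the opening keyframe, in seconds
    end: float    # timestamp of the next keyframe, in seconds
    comments: list = field(default_factory=list)


def build_local_segments(keyframe_times, timed_comments):
    """Group (timestamp, text) time-sync comments by keyframe interval.

    Assumes keyframe_times is sorted ascending. Each segment covers the
    half-open interval [start, end); comments outside the range spanned by
    the first and last keyframe are dropped in this sketch.
    """
    segments = [
        LocalSegment(start, end)
        for start, end in zip(keyframe_times, keyframe_times[1:])
    ]
    for timestamp, text in sorted(timed_comments):
        # Index of the last keyframe at or before the comment's timestamp.
        index = bisect_right(keyframe_times, timestamp) - 1
        if 0 <= index < len(segments):
            segments[index].comments.append(text)
    return segments


# Hypothetical sample data: four keyframes give three local segments.
keyframes = [0.0, 4.0, 9.5, 15.0]
comments = [(1.2, "is this real?"), (4.0, "source?"), (12.3, "fake"), (20.0, "late")]
segments = build_local_segments(keyframes, comments)
for segment in segments:
    print(segment.start, segment.end, segment.comments)
```

Binary search over the sorted keyframe timestamps keeps the assignment at O(log k) per comment for k keyframes. Because the segments are built in timeline order, the resulting list already follows the ordering the abstract describes for local segment sequences.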
Source journal
Information Processing & Management (Engineering and Technology: Computer Science, Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.