Unsupervised Video Summarization Based on Spatiotemporal Semantic Graph and Enhanced Attention Mechanism

IF 4.5 · CAS Region 2 (Computer Science) · JCR Q1 (Computer Science, Cybernetics)
Xin Cheng;Lei Yang;Rui Li
{"title":"基于时空语义图和增强注意机制的无监督视频摘要","authors":"Xin Cheng;Lei Yang;Rui Li","doi":"10.1109/TCSS.2025.3579570","DOIUrl":null,"url":null,"abstract":"Generative adversarial networks (GANs) have demonstrated potential in enhancing keyframe selection and video reconstruction via adversarial training among unsupervised approaches. Nevertheless, GANs struggle to encapsulate the intricate spatiotemporal dynamics in videos, which is essential for producing coherent and informative summaries. To address these challenges, we introduce an unsupervised video summarization framework that synergistically integrates temporal–spatial semantic graphs (TSSGraphs) with a bilinear additive attention (BAA) mechanism. TSSGraphs are designed to effectively model temporal and spatial relationships among video frames by combining temporal convolution and dynamic edge convolution, thereby extracting salient features while mitigating model complexity. The BAA mechanism enhances the framework’s ability to capture critical motion information by addressing feature sparsity and eliminating redundant parameters, ensuring robust attention to significant motion dynamics. Experimental assessments on the SumMe and TVSum benchmark datasets reveal that our method attains improvements of up to 4.0% and 3.3% in F-score, respectively, compared to current methodologies. Moreover, our system demonstrates diminished parameter overhead throughout training and inference stages, particularly excelling in contexts with significant motion content.","PeriodicalId":13044,"journal":{"name":"IEEE Transactions on Computational Social Systems","volume":"12 5","pages":"3751-3764"},"PeriodicalIF":4.5000,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Unsupervised Video Summarization Based on Spatiotemporal Semantic Graph and Enhanced Attention Mechanism\",\"authors\":\"Xin Cheng;Lei Yang;Rui Li\",\"doi\":\"10.1109/TCSS.2025.3579570\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Generative adversarial networks (GANs) have demonstrated potential in enhancing keyframe selection and video reconstruction via adversarial training among unsupervised approaches. Nevertheless, GANs struggle to encapsulate the intricate spatiotemporal dynamics in videos, which is essential for producing coherent and informative summaries. To address these challenges, we introduce an unsupervised video summarization framework that synergistically integrates temporal–spatial semantic graphs (TSSGraphs) with a bilinear additive attention (BAA) mechanism. TSSGraphs are designed to effectively model temporal and spatial relationships among video frames by combining temporal convolution and dynamic edge convolution, thereby extracting salient features while mitigating model complexity. The BAA mechanism enhances the framework’s ability to capture critical motion information by addressing feature sparsity and eliminating redundant parameters, ensuring robust attention to significant motion dynamics. Experimental assessments on the SumMe and TVSum benchmark datasets reveal that our method attains improvements of up to 4.0% and 3.3% in F-score, respectively, compared to current methodologies. 
Moreover, our system demonstrates diminished parameter overhead throughout training and inference stages, particularly excelling in contexts with significant motion content.\",\"PeriodicalId\":13044,\"journal\":{\"name\":\"IEEE Transactions on Computational Social Systems\",\"volume\":\"12 5\",\"pages\":\"3751-3764\"},\"PeriodicalIF\":4.5000,\"publicationDate\":\"2025-07-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computational Social Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11077719/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, CYBERNETICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computational Social Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11077719/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, CYBERNETICS","Score":null,"Total":0}
Citations: 0

Abstract

Generative adversarial networks (GANs) have demonstrated potential in enhancing keyframe selection and video reconstruction via adversarial training among unsupervised approaches. Nevertheless, GANs struggle to encapsulate the intricate spatiotemporal dynamics in videos, which is essential for producing coherent and informative summaries. To address these challenges, we introduce an unsupervised video summarization framework that synergistically integrates temporal–spatial semantic graphs (TSSGraphs) with a bilinear additive attention (BAA) mechanism. TSSGraphs are designed to effectively model temporal and spatial relationships among video frames by combining temporal convolution and dynamic edge convolution, thereby extracting salient features while mitigating model complexity. The BAA mechanism enhances the framework’s ability to capture critical motion information by addressing feature sparsity and eliminating redundant parameters, ensuring robust attention to significant motion dynamics. Experimental assessments on the SumMe and TVSum benchmark datasets reveal that our method attains improvements of up to 4.0% and 3.3% in F-score, respectively, compared to current methodologies. Moreover, our system demonstrates diminished parameter overhead throughout training and inference stages, particularly excelling in contexts with significant motion content.
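
The abstract describes a TSSGraph as combining temporal convolution with dynamic edge convolution over frame features, but does not give the architecture. Below is a minimal PyTorch sketch of one plausible reading: a 1-D temporal convolution over the frame sequence, fused with an EdgeConv-style dynamic graph convolution (as in DGCNN) whose kNN graph is rebuilt from the current features. The layer sizes, neighbour count `k`, and sum fusion are illustrative assumptions, not the authors' exact design.

```python
# Illustrative sketch only; hyperparameters and the fusion rule are assumptions.
import torch
import torch.nn as nn


def knn_graph(x, k):
    """Return (N, k) indices of each frame's k nearest neighbours (L2 distance)."""
    dist = torch.cdist(x, x)                        # (N, N) pairwise distances
    return dist.topk(k + 1, largest=False).indices[:, 1:]   # drop the self-match


class EdgeConv(nn.Module):
    """Dynamic edge convolution: h_i = max_j MLP([x_i, x_j - x_i])."""
    def __init__(self, in_dim, out_dim, k=8):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())

    def forward(self, x):                           # x: (N, in_dim)
        idx = knn_graph(x, self.k)                  # graph is rebuilt every call
        neigh = x[idx]                              # (N, k, in_dim)
        center = x.unsqueeze(1).expand_as(neigh)
        edge = torch.cat([center, neigh - center], dim=-1)
        return self.mlp(edge).max(dim=1).values     # (N, out_dim)


class TSSGraphBlock(nn.Module):
    """Temporal conv (local frame order) + dynamic EdgeConv (semantic graph)."""
    def __init__(self, dim, k=8, kernel=3):
        super().__init__()
        self.temporal = nn.Conv1d(dim, dim, kernel, padding=kernel // 2)
        self.spatial = EdgeConv(dim, dim, k)

    def forward(self, x):                           # x: (N_frames, dim)
        t = self.temporal(x.t().unsqueeze(0)).squeeze(0).t()
        s = self.spatial(x)
        return torch.relu(t + s)                    # assumed fusion: elementwise sum


frames = torch.randn(120, 256)                      # 120 frames of 256-D CNN features
print(TSSGraphBlock(256)(frames).shape)             # torch.Size([120, 256])
```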
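Likewise, the bilinear additive attention (BAA) formula is not spelled out in the abstract. The hedged sketch below shows one natural reading of the name: an additive (Bahdanau-style) score combined with a low-rank bilinear interaction. The low-rank factorisation keeps the parameter count well below a full dim-by-dim bilinear map, which is at least consistent with the claim of eliminating redundant parameters; the hidden size, rank, and the way the two scores are combined are assumptions for illustration.

```python
# Hedged sketch of a "bilinear + additive" attention score; not the paper's formula.
import torch
import torch.nn as nn


class BilinearAdditiveAttention(nn.Module):
    def __init__(self, dim, hidden=128, rank=32):
        super().__init__()
        self.wq = nn.Linear(dim, hidden, bias=False)   # additive branch
        self.wk = nn.Linear(dim, hidden, bias=False)
        self.v = nn.Linear(hidden, 1, bias=False)
        self.uq = nn.Linear(dim, rank, bias=False)     # low-rank bilinear branch:
        self.uk = nn.Linear(dim, rank, bias=False)     # score ~ (U q) . (V k)

    def forward(self, q, k):                           # q, k: (N, dim)
        add = self.v(torch.tanh(self.wq(q)[:, None] + self.wk(k)[None])).squeeze(-1)
        bil = self.uq(q) @ self.uk(k).t()              # (N, N) bilinear scores
        attn = torch.softmax(add + bil, dim=-1)        # attention over frames
        return attn @ k                                # attended frame features


feats = torch.randn(120, 256)
print(BilinearAdditiveAttention(256)(feats, feats).shape)   # torch.Size([120, 256])
```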
Source journal
IEEE Transactions on Computational Social Systems
Scopus category: Social Sciences (miscellaneous)
CiteScore: 10.00
Self-citation rate: 20.00%
Annual publications: 316
About the journal: IEEE Transactions on Computational Social Systems focuses on such topics as modeling, simulation, analysis, and understanding of social systems from the quantitative and/or computational perspective. "Systems" include man-man, man-machine, and machine-machine organizations and adversarial situations, as well as social media structures and their dynamics. More specifically, the transactions publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, and computational behavior modeling, and their applications.