Two Stream Dynamic Threshold Network for Weakly-Supervised Temporal Action Localization

Hao Yan, Jun Cheng, Qieshi Zhang, Ziliang Ren, Shijie Sun, Qin Cheng
2021 IEEE International Conference on Real-time Computing and Robotics (RCAR), published 2021-07-15
DOI: 10.1109/RCAR52367.2021.9517513 (https://doi.org/10.1109/RCAR52367.2021.9517513)

Abstract

Current mainstream temporal action localization methods are fully supervised and require time-consuming annotation of frame-level labels. Weakly-supervised methods greatly alleviate this problem, as they need only video-level labels to train the models. To generate accurate action boundaries, the recent two-stream consensus network (TSCN) proposes an attention normalization loss that explicitly forces attention values toward extreme values to avoid ambiguity. However, most previous methods, including TSCN, apply a fixed threshold in the attention loss to polarize the attention values, which lacks flexibility across different videos. In this paper, we propose a Dynamic Threshold Weakly-supervised action Localization (DH-WTAL) method to address this problem. DH-WTAL features a dynamic attention threshold decision for the attention mechanism. Specifically, the dynamic threshold controls the number of snippets selected for each video, which in turn adjusts the extreme values of the attention mechanism for that video. Extensive experiments demonstrate that DH-WTAL outperforms the TSCN baseline, and an ablation study validates the effectiveness of the method.
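The contrast between a fixed and a per-video snippet count can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the paper: the loss below is one common polarizing form (mean of the k lowest attention values minus mean of the k highest), and the function names, the `ratio` parameter, and the "above the video's own mean" rule in `dynamic_k` are all hypothetical, since the abstract does not specify how the dynamic threshold is computed.

```python
from typing import List


def attention_norm_loss(attention: List[float], k: int) -> float:
    """Mean of the k lowest attention values minus mean of the k highest.

    Minimizing this value pushes the k highest snippets toward 1 and the
    k lowest toward 0. Assumed form, for illustration only.
    """
    k = max(1, min(k, len(attention)))
    ordered = sorted(attention)
    bottom_mean = sum(ordered[:k]) / k
    top_mean = sum(ordered[-k:]) / k
    return bottom_mean - top_mean


def fixed_k(num_snippets: int, ratio: int = 8) -> int:
    """Fixed rule: the same fraction of snippets for every video.

    The default ratio is a hypothetical value, not taken from the paper.
    """
    return max(1, num_snippets // ratio)


def dynamic_k(attention: List[float]) -> int:
    """Hypothetical dynamic rule: count the snippets whose attention
    exceeds this video's own mean attention.
    """
    mean = sum(attention) / len(attention)
    return max(1, sum(1 for a in attention if a > mean))


# Two videos of equal length with different amounts of action.
sparse_video = [0.9, 0.8, 0.1, 0.2, 0.15, 0.1, 0.05, 0.1]
dense_video = [0.9, 0.8, 0.85, 0.7, 0.9, 0.1, 0.2, 0.1]

for name, att in (("sparse", sparse_video), ("dense", dense_video)):
    k_fixed = fixed_k(len(att))
    k_dyn = dynamic_k(att)
    print(name, k_fixed, k_dyn, attention_norm_loss(att, k_dyn))
```

With the fixed rule both videos get the same k, while the per-video rule selects more snippets for the video that contains more high-attention snippets. That is the kind of flexibility the abstract attributes to a dynamic threshold.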