A Temporal Convolutional Network for Weakly Supervised Action Segmentation

Z. Zou, Jiaqi Zou, Junzhe Liu, Songlin Sun
Published in: 2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC), 2021-11-17
DOI: 10.1109/IC-NIDC54101.2021.9660442 (https://doi.org/10.1109/IC-NIDC54101.2021.9660442)

Abstract

Weakly supervised video action segmentation is a key problem in video content understanding. In this setting, the ground truth provides only the set of actions that occur in a video, not frame-level labels. A popular class of approaches trains the prediction model with a neural network framework. Our key contribution is a new Hidden Markov Model (HMM) grounded on a Temporal Convolutional Network (TCN) that labels video frames and thereby generates a pseudo-ground truth for the subsequent pseudo-supervised training. At test time, we use the Viterbi algorithm to generate candidate temporal action sequences and select the maximum a posteriori sequence. We evaluate action segmentation performance on the Breakfast dataset. Experiments on this dataset show that our model performs well.
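As an illustration of the decoding step mentioned in the abstract, the following is a minimal sketch of Viterbi decoding over an HMM. It is not the authors' implementation: the function name `viterbi`, the two-state toy model, and all probability values are assumptions made up for this example. In the paper's setting the per-frame emission scores would come from the TCN; here they are written by hand.

```python
import math


def viterbi(log_emissions, log_transitions, log_initial):
    """Return the most probable state path and its log-probability.

    log_emissions[t][s]:   log-likelihood of frame t under action state s
    log_transitions[p][s]: log-probability of moving from state p to state s
    log_initial[s]:        log-probability of starting in state s
    """
    num_states = len(log_initial)
    # Best log-score of any path ending in each state at frame 0.
    score = [log_initial[s] + log_emissions[0][s] for s in range(num_states)]
    backpointers = []

    for t in range(1, len(log_emissions)):
        new_score, pointers = [], []
        for s in range(num_states):
            # Pick the predecessor state that maximises the path score into s.
            best_prev = max(
                range(num_states),
                key=lambda p: score[p] + log_transitions[p][s],
            )
            new_score.append(
                score[best_prev] + log_transitions[best_prev][s] + log_emissions[t][s]
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    last = max(range(num_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


def to_log(rows):
    """Convert a nested list of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Toy example (assumed values): 4 frames, 2 action states.
# Early frames favour state 0, late frames favour state 1.
emissions = to_log([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
transitions = to_log([[0.9, 0.1], [0.1, 0.9]])  # states tend to persist
initial = [math.log(0.5), math.log(0.5)]

path, log_prob = viterbi(emissions, transitions, initial)
print(path)  # [0, 0, 1, 1]
```

Working in log-space avoids numerical underflow on long videos, since the product of many per-frame probabilities quickly becomes too small to represent directly.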