Robust Action Segmentation from Timestamp Supervision

Yaser Souri, Yazan Abu Farha, Emad Bahrami, G. Francesca, Juergen Gall
Published in: Proceedings of the British Machine Vision Conference (BMVC) 2022, page 392
DOI: 10.48550/arXiv.2210.06501
Publication date: 2022-10-12
Citations: 3

Abstract

Action segmentation is the task of predicting an action label for each frame of an untrimmed video. As obtaining annotations to train an approach for action segmentation in a fully supervised way is expensive, various approaches have been proposed to train action segmentation models using different forms of weak supervision, e.g., action transcripts, action sets, or more recently timestamps. Timestamp supervision is a promising type of weak supervision as obtaining one timestamp per action is less expensive than annotating all frames, but it provides more information than other forms of weak supervision. However, previous works assume that every action instance is annotated with a timestamp, which is a restrictive assumption since it assumes that annotators do not miss any action. In this work, we relax this restrictive assumption and take missing annotations for some action instances into account. We show that our approach is more robust to missing annotations compared to other approaches and various baselines.
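The paper's method is not described here, but the abstract's core idea (one annotated timestamp per action instance instead of per-frame labels) can be illustrated with a naive pseudo-labeling baseline: assign every frame the label of its nearest timestamp. This is only an assumed illustrative sketch, not the approach proposed in the paper.

```python
# Naive pseudo-labeling from timestamp supervision: each frame takes the
# label of its nearest annotated timestamp. Illustrative baseline only;
# the paper's actual method is more sophisticated and handles missing
# timestamp annotations.

def nearest_timestamp_labels(num_frames, timestamps):
    """timestamps: list of (frame_index, action_label) pairs."""
    labels = []
    for f in range(num_frames):
        # pick the timestamp whose frame index is closest to frame f
        nearest = min(timestamps, key=lambda ts: abs(ts[0] - f))
        labels.append(nearest[1])
    return labels

# Example: a 10-frame video with two annotated timestamps
print(nearest_timestamp_labels(10, [(2, "pour"), (7, "stir")]))
# frames 0-4 -> "pour", frames 5-9 -> "stir"
```

Such a baseline breaks down exactly in the case the paper targets: if an annotator misses an action instance entirely, its frames are silently absorbed into neighboring actions, which is why robustness to missing annotations matters.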