Towards gaze-based video annotation

2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA) Pub Date : 2016-12-01 DOI:10.1109/IPTA.2016.7821028

Mohamed Soliman, H. R. Tavakoli, Jorma T. Laaksonen


Citations: 2

Abstract

This paper presents our efforts towards a framework for video annotation using gaze. In computer vision, video annotation (VA) is an essential step in providing ground truth for evaluating object detection and tracking techniques. VA is a demanding part of developing video processing algorithms, because each object of interest must be labelled manually. Although the community has handled VA for a long time, the size of new data sets and the complexity of new tasks push us to revisit it. A barrier to automated video annotation is recognizing the object of interest and tracking it over image sequences. To tackle this problem, we employ the concept of visual attention to enhance video annotation. In an image, human attention is naturally drawn to interesting areas, which provide valuable information for extracting the objects of interest and can be exploited to annotate videos. Under task-based gaze recording, we use an observer's gaze to filter seed object detector responses in a video sequence. The filtered boxes are then passed to an appearance-based tracking algorithm. We evaluate the usefulness of gaze by comparing the algorithm with and without it. We show that eye gaze is an influential cue for automated video annotation and improves the annotation significantly.
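The gaze-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual filtering rule, so everything here is an assumption made for illustration: the `Box` type, the helper names `gaze_distance` and `filter_boxes_by_gaze`, and the rule itself (keep detector boxes whose nearest edge lies within `max_dist` pixels of a single gaze point, then rank them by detector score). It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical detector response: top-left corner, size, confidence."""
    x: float
    y: float
    w: float
    h: float
    score: float


def gaze_distance(box: Box, gaze: Tuple[float, float]) -> float:
    """Euclidean distance from a gaze point to the box (0.0 if inside)."""
    gx, gy = gaze
    # Per-axis distance to the nearest box edge, clamped to zero inside the box.
    dx = max(box.x - gx, 0.0, gx - (box.x + box.w))
    dy = max(box.y - gy, 0.0, gy - (box.y + box.h))
    return (dx * dx + dy * dy) ** 0.5


def filter_boxes_by_gaze(
    boxes: List[Box], gaze: Tuple[float, float], max_dist: float = 20.0
) -> List[Box]:
    """Keep boxes near the gaze point, highest detector score first.

    Assumed rule for illustration only; the threshold is arbitrary.
    """
    kept = [b for b in boxes if gaze_distance(b, gaze) <= max_dist]
    return sorted(kept, key=lambda b: b.score, reverse=True)


if __name__ == "__main__":
    detections = [
        Box(10, 10, 50, 50, 0.6),    # contains the gaze point
        Box(200, 200, 40, 40, 0.9),  # far from the gaze point
        Box(55, 20, 30, 30, 0.8),    # 5 px from the gaze point
    ]
    seeds = filter_boxes_by_gaze(detections, gaze=(50.0, 40.0))
    # The surviving boxes would seed an appearance-based tracker.
    for seed in seeds:
        print(seed)
```

In this sketch the high-scoring but unattended box is discarded, which reflects the idea that gaze, rather than detector confidence alone, decides which responses are handed to the tracker.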

