Patch-wise Weakly Supervised Learning for Object Localization in Video

2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Pub Date: 2019-02-01, DOI: 10.1109/ICAIIC.2019.8668987

Dong Huh, Taekyung Kim, Jaeil Kim


Citations: 1

Abstract

Object localization in video aims to predict the location and image boundaries of objects of interest across sequential scenes. Although numerous methods have been developed for this task, challenges remain, such as labor-intensive data preparation. In this paper, we propose a patch-wise approach with weak supervision to address these issues. We first train a patch-wise object classifier based on a convolutional neural network, using simple object-class labels instead of bounding-box annotations. The object regions are then estimated from the classifier's class activation maps for each patch. Because the patches contain various parts of the objects, the patch-wise classifier can learn more relevant object features. In addition, background patches for weakly supervised learning can be prepared easily. Experiments on the Visual Object Tracking challenge data set showed that the patch-wise weakly supervised approach is effective for object localization in video.
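As a rough illustration of how a class activation map can be turned into an object region for a single patch, the following minimal sketch may help. It is not the authors' implementation: it assumes the common formulation in which the map is a weighted sum of the last convolutional feature maps, with weights taken from a fully connected layer that follows global average pooling. The function names, the array shapes, and the 0.5 threshold are hypothetical choices made only for this example.

```python
import numpy as np


def class_activation_map(features, fc_weights, class_idx):
    """Weighted sum of feature maps for one class, scaled to [0, 1].

    features:   (K, H, W) last-layer convolutional feature maps of one patch (assumed shape)
    fc_weights: (C, K) classifier weights applied after global average pooling (assumed shape)
    """
    # Contract the channel axis: sum_k w[class_idx, k] * features[k]
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def cam_to_box(cam, threshold=0.5):
    """Bounding box (x_min, y_min, x_max, y_max) of cells at or above the threshold.

    The threshold is an illustrative value, not one taken from the paper.
    Returns None when no cell reaches the threshold.
    """
    mask = cam >= threshold
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return (int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1]))


# Toy patch: channel 0 responds to a 2x2 region, channel 1 is silent.
toy_features = np.zeros((2, 4, 4))
toy_features[0, 1:3, 1:3] = 1.0
toy_weights = np.array([[1.0, 0.0],   # class 0 ("object") uses channel 0
                        [0.0, 1.0]])  # class 1 ("background") uses channel 1

object_box = cam_to_box(class_activation_map(toy_features, toy_weights, 0))
background_box = cam_to_box(class_activation_map(toy_features, toy_weights, 1))
```

In a video setting, the box coordinates would still need to be mapped from feature-map cells back to patch pixels and then to frame coordinates; that step depends on the network stride and patch layout, which the abstract does not specify.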


