Patch-wise Weakly Supervised Learning for Object Localization in Video
Dong Huh, Taekyung Kim, Jaeil Kim
2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), published 2019-02-01
DOI: 10.1109/ICAIIC.2019.8668987 (https://doi.org/10.1109/ICAIIC.2019.8668987)
Object localization in video is the task of predicting the location and image boundaries of objects of interest in sequential scenes. Although numerous methods have been developed for the task, challenging issues remain, such as labor-intensive data preparation. In this paper, we propose a patch-wise approach with weak supervision to address these issues in object localization. We first train a patch-wise object classifier based on a convolutional neural network using simple object-class labels instead of bounding box annotations. Then, the object regions are estimated from the class activation maps of the classifier for each patch. Because the patches contain various parts of the objects, the patch-wise classifier can learn more relevant object features. In addition, background patches for weakly supervised learning can be easily prepared. Experiments on the Visual Object Tracking challenge data set showed that the patch-wise weakly supervised approach is effective for object localization in video.
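The localization step can be illustrated with a minimal sketch. The abstract does not specify the network or how activation maps are turned into regions, so everything here is an assumption made for illustration: the standard class activation map formulation (a weighted sum of the last convolutional feature maps using the classifier weights of one class), a fixed threshold of 0.5, and the hypothetical helper names `class_activation_map`, `bounding_box`, and `to_frame_coords`. The toy feature maps are assumed to have the same resolution as the patch, so no upsampling is shown.

```python
import numpy as np


def class_activation_map(features, fc_weights, class_idx):
    """Compute a normalized class activation map for one patch.

    features:   array of shape (C, H, W), last conv-layer feature maps.
    fc_weights: array of shape (num_classes, C), weights of the layer that
                follows global average pooling.
    """
    # Weighted sum over channels: M_c(x, y) = sum_k w_k^c * f_k(x, y)
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = cam - cam.min()
    peak = cam.max()
    # Scale to [0, 1]; a flat map is returned unchanged (all zeros).
    return cam / peak if peak > 0 else cam


def bounding_box(cam, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) of cells at or above threshold."""
    ys, xs = np.nonzero(cam >= threshold)
    if ys.size == 0:
        return None  # nothing activated strongly enough
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def to_frame_coords(box, patch_origin):
    """Shift a patch-local box by the patch's (x, y) origin in the frame."""
    x0, y0 = patch_origin
    x_min, y_min, x_max, y_max = box
    return x_min + x0, y_min + y0, x_max + x0, y_max + y0


# Toy patch: channel 0 responds to the object, channel 1 is uniform background.
features = np.zeros((2, 6, 6))
features[0, 1:4, 2:5] = 1.0
features[1, :, :] = 0.2
# Row 0 holds the weights of the object class, row 1 those of the background.
fc_weights = np.array([[1.0, 0.0],
                       [0.0, 1.0]])

cam = class_activation_map(features, fc_weights, class_idx=0)
box = bounding_box(cam)
print(box)                                          # (2, 1, 4, 3)
print(to_frame_coords(box, patch_origin=(32, 16)))  # (34, 17, 36, 19)
```

In this sketch each patch yields its own box, which is then shifted into frame coordinates; how the boxes from overlapping patches would be merged is left open because the abstract does not describe it.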