Eye-tracking Data for Weakly Supervised Object Detection
Ching-Hsi Tseng, Yen-Pin Hsu, S. Yuan
2020 IEEE Eurasia Conference on IOT, Communication and Engineering (ECICE), 2020-10-23
DOI: 10.1109/ECICE50847.2020.9301923 (https://doi.org/10.1109/ECICE50847.2020.9301923)
We propose a weakly supervised object detection network based on eye-tracking data. A large number of training samples cannot be used for two reasons: (1) not all training samples in object detection carry pixel-level labels, and (2) the cost of labeling is too high. We therefore introduce a framework whose input combines images that have only image-level labels with eye-tracking data. Because the eye-tracking data supplies position information, the framework performs effectively even when sample annotation is incomplete. We use an eye tracker to collect data on the most interesting area of each sample image and represent that data as fixations. The bounding boxes produced from the fixation data, together with the original image-level labels, then become the input to the object detection network. In this way, eye-tracking data helps us select bounding boxes and provides detailed location information. Experimental results verify that the framework is effective with the support of eye-tracking data.
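The abstract does not say how bounding boxes are derived from fixations, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes fixations are given as (x, y) pixel coordinates and takes the smallest axis-aligned box around them, padded by a margin and clipped to the image. The names `fixations_to_box`, `make_training_sample`, and the `margin` parameter are hypothetical and introduced here for illustration.

```python
from typing import Dict, List, Tuple

Fixation = Tuple[float, float]   # (x, y) in pixel coordinates (assumed format)
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def fixations_to_box(fixations: List[Fixation], width: int, height: int,
                     margin: int = 20) -> Box:
    """Return a padded box around all fixations, clipped to the image.

    Assumption: a simple min/max envelope stands in for whatever
    box-generation step the paper actually uses.
    """
    if not fixations:
        raise ValueError("at least one fixation is required")
    xs = [x for x, _ in fixations]
    ys = [y for _, y in fixations]
    # Expand the envelope by `margin` pixels, then clip to image bounds.
    x_min = max(0, int(min(xs)) - margin)
    y_min = max(0, int(min(ys)) - margin)
    x_max = min(width - 1, int(max(xs)) + margin)
    y_max = min(height - 1, int(max(ys)) + margin)
    return (x_min, y_min, x_max, y_max)


def make_training_sample(image_label: str, fixations: List[Fixation],
                         width: int, height: int) -> Dict[str, object]:
    """Pair an image-level label with a fixation-derived box (hypothetical format)."""
    return {"label": image_label,
            "box": fixations_to_box(fixations, width, height)}


if __name__ == "__main__":
    sample = make_training_sample(
        "dog", [(100, 120), (140, 160), (130, 90)], width=640, height=480)
    print(sample)
```

A min/max envelope is sensitive to stray fixations away from the object; a real pipeline would likely filter or cluster fixations first, which is outside what the abstract describes.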