EventGAN: Leveraging Large Scale Image Datasets for Event Cameras

2021 IEEE International Conference on Computational Photography (ICCP) | Pub Date: 2019-12-03 | DOI: 10.1109/ICCP51581.2021.9466265

A. Z. Zhu, ZiYun Wang, Kaung Khant, Kostas Daniilidis

Citations: 27

Abstract

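Event cameras offer several benefits over traditional cameras, including the ability to track extremely fast motions, high dynamic range, and low power consumption. However, their application to computer vision problems, many of which are dominated by deep learning solutions, has been limited by the lack of labeled training data for events. In this work, we propose a method that leverages existing labeled image data by simulating events from a pair of temporally adjacent image frames using a convolutional neural network. We train this network on pairs of images and events, using an adversarial discriminator loss and a pair of cycle consistency losses. The cycle consistency losses use a pair of pre-trained self-supervised networks that perform optical flow estimation and image reconstruction from events, and constrain our network to generate events that yield accurate outputs from both of these networks. Trained fully end to end, our network learns a generative model for events from images without the accurate modeling of scene motion that model-based methods require, while also implicitly modeling event noise. Using this simulator, we train a pair of downstream networks for object detection and 2D human pose estimation from events, using simulated data from large scale image datasets, and demonstrate that these networks generalize to datasets with real events. The code and dataset for this paper are available at https://github.com/alexzzhu/EventGAN.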
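
The training signal described in the abstract (one adversarial term plus two cycle consistency terms) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the non-saturating log loss, the L1 errors, and the weights w_adv, w_flow, and w_recon are all hypothetical, and the inputs are assumed to be flat lists of floats already produced by the discriminator and by the two pre-trained networks.

```python
import math


def adversarial_generator_loss(disc_probs_fake):
    """Assumed non-saturating GAN loss: -mean(log D(generated events))."""
    # Clamp to avoid log(0); the clamp value is an arbitrary choice.
    return -sum(math.log(max(p, 1e-12)) for p in disc_probs_fake) / len(disc_probs_fake)


def l1_error(pred, target):
    """Mean absolute error between two equally sized flat lists."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def generator_objective(disc_probs_fake, warped_prev, next_image,
                        recon_image, target_image,
                        w_adv=1.0, w_flow=1.0, w_recon=1.0):
    """Hypothetical combined objective for the event generator.

    disc_probs_fake: discriminator probabilities on the generated events.
    warped_prev:     first frame warped by the flow that the pre-trained
                     flow network predicts from the generated events.
    next_image:      second frame, the target of the warp.
    recon_image:     image that the pre-trained reconstruction network
                     predicts from the generated events.
    target_image:    frame the reconstruction is compared against.
    """
    adv = adversarial_generator_loss(disc_probs_fake)
    flow_cycle = l1_error(warped_prev, next_image)      # flow cycle consistency
    recon_cycle = l1_error(recon_image, target_image)   # reconstruction cycle consistency
    return w_adv * adv + w_flow * flow_cycle + w_recon * recon_cycle


if __name__ == "__main__":
    # Toy numbers only; they do not come from the paper.
    loss = generator_objective([0.5], [0.0, 1.0], [0.5, 0.5], [0.2], [0.2])
    print(f"generator objective: {loss:.4f}")
```

In this sketch both cycle terms fall to zero when the generated events let the two pre-trained networks recover the second frame and the target image exactly, which is the constraint the abstract describes in words.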
