高速和高动态范围视频与事件相机。

IF 20.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2021-06-01 Epub Date: 2021-05-11 DOI:10.1109/TPAMI.2019.2963386

Henri Rebecq, Rene Ranftl, Vladlen Koltun, Davide Scaramuzza

{"title":"高速和高动态范围视频与事件相机。","authors":"Henri Rebecq, Rene Ranftl, Vladlen Koltun, Davide Scaramuzza","doi":"10.1109/TPAMI.2019.2963386","DOIUrl":null,"url":null,"abstract":"Event cameras are novel sensors that report brightness changes in the form of a stream of asynchronous \"events\" instead of intensity frames. They offer significant advantages with respect to conventional cameras: high temporal resolution, high dynamic range, and no motion blur. While the stream of events encodes in principle the complete visual signal, the reconstruction of an intensity image from a stream of events is an ill-posed problem in practice. Existing reconstruction approaches are based on hand-crafted priors and strong assumptions about the imaging process as well as the statistics of natural images. In this work we propose to learn to reconstruct intensity images from event streams directly from data instead of relying on any hand-crafted priors. We propose a novel recurrent network to reconstruct videos from a stream of events, and train it on a large amount of simulated event data. During training we propose to use a perceptual loss to encourage reconstructions to follow natural image statistics. We further extend our approach to synthesize color images from color event streams. Our quantitative experiments show that our network surpasses state-of-the-art reconstruction methods by a large margin in terms of image quality ( ), while comfortably running in real-time. We show that the network is able to synthesize high framerate videos ( frames per second) of high-speed phenomena (e.g., a bullet hitting an object) and is able to provide high dynamic range reconstructions in challenging lighting conditions. As an additional contribution, we demonstrate the effectiveness of our reconstructions as an intermediate representation for event data. We show that off-the-shelf computer vision algorithms can be applied to our reconstructions for tasks such as object classification and visual-inertial odometry and that this strategy consistently outperforms algorithms that were specifically designed for event data. We release the reconstruction code, a pre-trained model and the datasets to enable further research.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"43 6","pages":"1964-1980"},"PeriodicalIF":20.8000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TPAMI.2019.2963386","citationCount":"325","resultStr":"{\"title\":\"High Speed and High Dynamic Range Video with an Event Camera.\",\"authors\":\"Henri Rebecq, Rene Ranftl, Vladlen Koltun, Davide Scaramuzza\",\"doi\":\"10.1109/TPAMI.2019.2963386\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Event cameras are novel sensors that report brightness changes in the form of a stream of asynchronous \\\"events\\\" instead of intensity frames. They offer significant advantages with respect to conventional cameras: high temporal resolution, high dynamic range, and no motion blur. While the stream of events encodes in principle the complete visual signal, the reconstruction of an intensity image from a stream of events is an ill-posed problem in practice. Existing reconstruction approaches are based on hand-crafted priors and strong assumptions about the imaging process as well as the statistics of natural images. In this work we propose to learn to reconstruct intensity images from event streams directly from data instead of relying on any hand-crafted priors. We propose a novel recurrent network to reconstruct videos from a stream of events, and train it on a large amount of simulated event data. During training we propose to use a perceptual loss to encourage reconstructions to follow natural image statistics. We further extend our approach to synthesize color images from color event streams. Our quantitative experiments show that our network surpasses state-of-the-art reconstruction methods by a large margin in terms of image quality ( ), while comfortably running in real-time. We show that the network is able to synthesize high framerate videos ( frames per second) of high-speed phenomena (e.g., a bullet hitting an object) and is able to provide high dynamic range reconstructions in challenging lighting conditions. As an additional contribution, we demonstrate the effectiveness of our reconstructions as an intermediate representation for event data. We show that off-the-shelf computer vision algorithms can be applied to our reconstructions for tasks such as object classification and visual-inertial odometry and that this strategy consistently outperforms algorithms that were specifically designed for event data. We release the reconstruction code, a pre-trained model and the datasets to enable further research.\",\"PeriodicalId\":13426,\"journal\":{\"name\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"volume\":\"43 6\",\"pages\":\"1964-1980\"},\"PeriodicalIF\":20.8000,\"publicationDate\":\"2021-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/TPAMI.2019.2963386\",\"citationCount\":\"325\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/TPAMI.2019.2963386\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2021/5/11 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TPAMI.2019.2963386","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2021/5/11 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 325

摘要

事件相机是一种新颖的传感器，它以异步“事件”流的形式报告亮度变化，而不是强度帧。与传统相机相比，它们具有显著的优势:高时间分辨率，高动态范围，无运动模糊。虽然事件流原则上编码了完整的视觉信号，但从事件流中重建强度图像在实践中是一个不适定问题。现有的重建方法是基于手工制作的先验和对成像过程的强假设以及自然图像的统计。在这项工作中，我们建议学习直接从数据中重建事件流的强度图像，而不是依赖于任何手工制作的先验。我们提出了一种新的循环网络来从事件流中重构视频，并在大量的模拟事件数据上对其进行训练。在训练过程中，我们建议使用感知损失来鼓励重建遵循自然图像统计。我们进一步扩展了从彩色事件流合成彩色图像的方法。我们的定量实验表明，我们的网络在图像质量()方面大大超过了最先进的重建方法，同时可以舒适地实时运行。我们表明，该网络能够合成高速现象(例如，子弹击中物体)的高帧率视频(每秒帧数)，并能够在具有挑战性的照明条件下提供高动态范围重建。作为额外的贡献，我们证明了我们的重建作为事件数据的中间表示的有效性。我们表明，现成的计算机视觉算法可以应用于我们的任务重建，如对象分类和视觉惯性里程计，并且这种策略始终优于专门为事件数据设计的算法。我们发布了重建代码、预训练模型和数据集，以便进一步研究。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

High Speed and High Dynamic Range Video with an Event Camera.

Event cameras are novel sensors that report brightness changes in the form of a stream of asynchronous "events" instead of intensity frames. They offer significant advantages with respect to conventional cameras: high temporal resolution, high dynamic range, and no motion blur. While the stream of events encodes in principle the complete visual signal, the reconstruction of an intensity image from a stream of events is an ill-posed problem in practice. Existing reconstruction approaches are based on hand-crafted priors and strong assumptions about the imaging process as well as the statistics of natural images. In this work we propose to learn to reconstruct intensity images from event streams directly from data instead of relying on any hand-crafted priors. We propose a novel recurrent network to reconstruct videos from a stream of events, and train it on a large amount of simulated event data. During training we propose to use a perceptual loss to encourage reconstructions to follow natural image statistics. We further extend our approach to synthesize color images from color event streams. Our quantitative experiments show that our network surpasses state-of-the-art reconstruction methods by a large margin in terms of image quality ( ), while comfortably running in real-time. We show that the network is able to synthesize high framerate videos ( frames per second) of high-speed phenomena (e.g., a bullet hitting an object) and is able to provide high dynamic range reconstructions in challenging lighting conditions. As an additional contribution, we demonstrate the effectiveness of our reconstructions as an intermediate representation for event data. We show that off-the-shelf computer vision algorithms can be applied to our reconstructions for tasks such as object classification and visual-inertial odometry and that this strategy consistently outperforms algorithms that were specifically designed for event data. We release the reconstruction code, a pre-trained model and the datasets to enable further research.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Pattern Analysis and Machine Intelligence 工程技术-工程：电子与电气

CiteScore

28.40

自引率

3.00%

发文量

885

审稿时长

8.5 months

期刊介绍： The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.