High Speed and High Dynamic Range Video with an Event Camera.

IF 20.8 · Zone 1 (Computer Science) · JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Henri Rebecq, Rene Ranftl, Vladlen Koltun, Davide Scaramuzza
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 6, pp. 1964-1980. Published 2021-06-01 (Epub 2021-05-11). Journal Article.
DOI: 10.1109/TPAMI.2019.2963386 (https://doi.org/10.1109/TPAMI.2019.2963386)
Citations: 325

Abstract

Event cameras are novel sensors that report brightness changes in the form of a stream of asynchronous "events" instead of intensity frames. They offer significant advantages with respect to conventional cameras: high temporal resolution, high dynamic range, and no motion blur. While the stream of events encodes in principle the complete visual signal, the reconstruction of an intensity image from a stream of events is an ill-posed problem in practice. Existing reconstruction approaches are based on hand-crafted priors and strong assumptions about the imaging process as well as the statistics of natural images. In this work we propose to learn to reconstruct intensity images from event streams directly from data instead of relying on any hand-crafted priors. We propose a novel recurrent network to reconstruct videos from a stream of events, and train it on a large amount of simulated event data. During training we propose to use a perceptual loss to encourage reconstructions to follow natural image statistics. We further extend our approach to synthesize color images from color event streams. Our quantitative experiments show that our network surpasses state-of-the-art reconstruction methods by a large margin in terms of image quality ( ), while comfortably running in real-time. We show that the network is able to synthesize high framerate videos ( frames per second) of high-speed phenomena (e.g., a bullet hitting an object) and is able to provide high dynamic range reconstructions in challenging lighting conditions. As an additional contribution, we demonstrate the effectiveness of our reconstructions as an intermediate representation for event data. We show that off-the-shelf computer vision algorithms can be applied to our reconstructions for tasks such as object classification and visual-inertial odometry and that this strategy consistently outperforms algorithms that were specifically designed for event data. 
We release the reconstruction code, a pre-trained model and the datasets to enable further research.
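
To make the phrase "report brightness changes in the form of a stream of asynchronous events" concrete, here is a minimal sketch of the idealized event model that is commonly assumed for such sensors: a pixel emits an event of polarity +1 or -1 each time its log intensity moves by a fixed contrast threshold. This is not the learned recurrent network described in the abstract. The function names, the threshold value, and the tuple layout `(t, x, y, polarity)` are assumptions chosen for illustration. The second function is the naive baseline of summing polarities per pixel, which recovers the image only up to the threshold and only when the starting image is known.

```python
import math

def simulate_events(frames, threshold=0.2, eps=1e-3):
    """Idealized event generation (assumed model, not the paper's simulator).

    frames: list of 2D lists with intensities in [0, 1].
    Returns a list of (t, x, y, polarity) tuples.
    """
    height, width = len(frames[0]), len(frames[0][0])
    # Per-pixel reference log intensity, updated each time an event fires.
    ref = [[math.log(frames[0][y][x] + eps) for x in range(width)]
           for y in range(height)]
    events = []
    for t, frame in enumerate(frames[1:], start=1):
        for y in range(height):
            for x in range(width):
                log_i = math.log(frame[y][x] + eps)
                diff = log_i - ref[y][x]
                # Fire one event per threshold crossing.
                while abs(diff) >= threshold:
                    polarity = 1 if diff > 0 else -1
                    events.append((t, x, y, polarity))
                    ref[y][x] += polarity * threshold
                    diff = log_i - ref[y][x]
    return events

def integrate_events(events, first_frame, threshold=0.2, eps=1e-3):
    """Naive reconstruction: add polarity * threshold to the initial log image."""
    log_img = [[math.log(v + eps) for v in row] for row in first_frame]
    for _, x, y, polarity in events:
        log_img[y][x] += polarity * threshold
    return [[math.exp(v) - eps for v in row] for row in log_img]

# One row, two pixels: the left pixel brightens, the right one is static.
frames = [[[0.2, 0.5]], [[0.8, 0.5]]]
events = simulate_events(frames)
recon = integrate_events(events, frames[0])
```

In this toy case only the pixel that changes produces events, and the static pixel produces none. The integrated estimate for the changed pixel stays within one threshold of the true log intensity, so even under ideal assumptions the result is quantized, and it depends on an initial image that an event stream alone does not provide.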

Source journal
CiteScore: 28.40
Self-citation rate: 3.00%
Articles published: 885
Review time: 8.5 months
About the journal: The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. It also covers techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition, and relevant specialized hardware and software architectures.