E2VIDX:传统视觉和仿生视觉之间的桥梁。

IF 2.6 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Frontiers in Neurorobotics Pub Date : 2023-10-26 eCollection Date: 2023-01-01 DOI:10.3389/fnbot.2023.1277160

Xujia Hou, Feihu Zhang, Dhiraj Gulati, Tingfeng Tan, Wei Zhang

{"title":"E2VIDX:传统视觉和仿生视觉之间的桥梁。","authors":"Xujia Hou, Feihu Zhang, Dhiraj Gulati, Tingfeng Tan, Wei Zhang","doi":"10.3389/fnbot.2023.1277160","DOIUrl":null,"url":null,"abstract":"Common RGBD, CMOS, and CCD-based cameras produce motion blur and incorrect exposure under high-speed and improper lighting conditions. According to the bionic principle, the event camera developed has the advantages of low delay, high dynamic range, and no motion blur. However, due to its unique data representation, it encounters significant obstacles in practical applications. The image reconstruction algorithm based on an event camera solves the problem by converting a series of \"events\" into common frames to apply existing vision algorithms. Due to the rapid development of neural networks, this field has made significant breakthroughs in past few years. Based on the most popular Events-to-Video (E2VID) method, this study designs a new network called E2VIDX. The proposed network includes group convolution and sub-pixel convolution, which not only achieves better feature fusion but also the network model size is reduced by 25%. Futhermore, we propose a new loss function. The loss function is divided into two parts, first part calculates the high level features and the second part calculates the low level features of the reconstructed image. The experimental results clearly outperform against the state-of-the-art method. Compared with the original method, Structural Similarity (SSIM) increases by 1.3%, Learned Perceptual Image Patch Similarity (LPIPS) decreases by 1.7%, Mean Squared Error (MSE) decreases by 2.5%, and it runs faster on GPU and CPU. Additionally, we evaluate the results of E2VIDX with application to image classification, object detection, and instance segmentation. The experiments show that conversions using our method can help event cameras directly apply existing vision algorithms in most scenarios.","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1277160"},"PeriodicalIF":2.6000,"publicationDate":"2023-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10639115/pdf/","citationCount":"0","resultStr":"{\"title\":\"E2VIDX: improved bridge between conventional vision and bionic vision.\",\"authors\":\"Xujia Hou, Feihu Zhang, Dhiraj Gulati, Tingfeng Tan, Wei Zhang\",\"doi\":\"10.3389/fnbot.2023.1277160\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Common RGBD, CMOS, and CCD-based cameras produce motion blur and incorrect exposure under high-speed and improper lighting conditions. According to the bionic principle, the event camera developed has the advantages of low delay, high dynamic range, and no motion blur. However, due to its unique data representation, it encounters significant obstacles in practical applications. The image reconstruction algorithm based on an event camera solves the problem by converting a series of \\\"events\\\" into common frames to apply existing vision algorithms. Due to the rapid development of neural networks, this field has made significant breakthroughs in past few years. Based on the most popular Events-to-Video (E2VID) method, this study designs a new network called E2VIDX. The proposed network includes group convolution and sub-pixel convolution, which not only achieves better feature fusion but also the network model size is reduced by 25%. Futhermore, we propose a new loss function. The loss function is divided into two parts, first part calculates the high level features and the second part calculates the low level features of the reconstructed image. The experimental results clearly outperform against the state-of-the-art method. Compared with the original method, Structural Similarity (SSIM) increases by 1.3%, Learned Perceptual Image Patch Similarity (LPIPS) decreases by 1.7%, Mean Squared Error (MSE) decreases by 2.5%, and it runs faster on GPU and CPU. Additionally, we evaluate the results of E2VIDX with application to image classification, object detection, and instance segmentation. The experiments show that conversions using our method can help event cameras directly apply existing vision algorithms in most scenarios.\",\"PeriodicalId\":12628,\"journal\":{\"name\":\"Frontiers in Neurorobotics\",\"volume\":\"17 \",\"pages\":\"1277160\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2023-10-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10639115/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Frontiers in Neurorobotics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.3389/fnbot.2023.1277160\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Neurorobotics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3389/fnbot.2023.1277160","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/1/1 0:00:00","PubModel":"eCollection","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

常见的RGBD、CMOS和ccd相机在高速和不适当的照明条件下会产生运动模糊和不正确的曝光。根据仿生学原理研制的事件摄像机具有低延迟、高动态范围、无运动模糊等优点。然而，由于其独特的数据表示，它在实际应用中遇到了很大的障碍。基于事件相机的图像重建算法通过将一系列“事件”转换为通用帧来应用现有的视觉算法来解决这个问题。由于神经网络的快速发展，这一领域在过去几年中取得了重大突破。基于最流行的事件到视频(E2VID)方法，本研究设计了一个名为E2VIDX的新网络。该网络包括群卷积和亚像素卷积，不仅实现了更好的特征融合，而且网络模型尺寸减小了25%。在此基础上，提出了一种新的损失函数。损失函数分为两部分，第一部分计算重构图像的高阶特征，第二部分计算重构图像的低阶特征。实验结果明显优于最先进的方法。与原始方法相比，结构相似度(SSIM)提高了1.3%，学习感知图像补丁相似度(LPIPS)降低了1.7%，均方误差(MSE)降低了2.5%，并且在GPU和CPU上的运行速度更快。此外，我们还评估了E2VIDX在图像分类、目标检测和实例分割方面的应用结果。实验表明，我们的方法可以帮助事件相机在大多数场景下直接应用现有的视觉算法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

E2VIDX: improved bridge between conventional vision and bionic vision.

Common RGBD, CMOS, and CCD-based cameras produce motion blur and incorrect exposure under high-speed and improper lighting conditions. According to the bionic principle, the event camera developed has the advantages of low delay, high dynamic range, and no motion blur. However, due to its unique data representation, it encounters significant obstacles in practical applications. The image reconstruction algorithm based on an event camera solves the problem by converting a series of "events" into common frames to apply existing vision algorithms. Due to the rapid development of neural networks, this field has made significant breakthroughs in past few years. Based on the most popular Events-to-Video (E2VID) method, this study designs a new network called E2VIDX. The proposed network includes group convolution and sub-pixel convolution, which not only achieves better feature fusion but also the network model size is reduced by 25%. Futhermore, we propose a new loss function. The loss function is divided into two parts, first part calculates the high level features and the second part calculates the low level features of the reconstructed image. The experimental results clearly outperform against the state-of-the-art method. Compared with the original method, Structural Similarity (SSIM) increases by 1.3%, Learned Perceptual Image Patch Similarity (LPIPS) decreases by 1.7%, Mean Squared Error (MSE) decreases by 2.5%, and it runs faster on GPU and CPU. Additionally, we evaluate the results of E2VIDX with application to image classification, object detection, and instance segmentation. The experiments show that conversions using our method can help event cameras directly apply existing vision algorithms in most scenarios.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Frontiers in Neurorobotics COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCER-ROBOTICS

CiteScore

5.20

自引率

6.50%

发文量

250

审稿时长

14 weeks

期刊介绍： Frontiers in Neurorobotics publishes rigorously peer-reviewed research in the science and technology of embodied autonomous neural systems. Specialty Chief Editors Alois C. Knoll and Florian Röhrbein at the Technische Universität München are supported by an outstanding Editorial Board of international experts. This multidisciplinary open-access journal is at the forefront of disseminating and communicating scientific knowledge and impactful discoveries to researchers, academics and the public worldwide. Neural systems include brain-inspired algorithms (e.g. connectionist networks), computational models of biological neural networks (e.g. artificial spiking neural nets, large-scale simulations of neural microcircuits) and actual biological systems (e.g. in vivo and in vitro neural nets). The focus of the journal is the embodiment of such neural systems in artificial software and hardware devices, machines, robots or any other form of physical actuation. This also includes prosthetic devices, brain machine interfaces, wearable systems, micro-machines, furniture, home appliances, as well as systems for managing micro and macro infrastructures. Frontiers in Neurorobotics also aims to publish radically new tools and methods to study plasticity and development of autonomous self-learning systems that are capable of acquiring knowledge in an open-ended manner. Models complemented with experimental studies revealing self-organizing principles of embodied neural systems are welcome. Our journal also publishes on the micro and macro engineering and mechatronics of robotic devices driven by neural systems, as well as studies on the impact that such systems will have on our daily life.