Shanmuga Venkatachalam, V. Vivekanand, R. Kubendran
{"title":"事件框架:立体深度图的低延迟资源效率方法","authors":"Shanmuga Venkatachalam, V. Vivekanand, R. Kubendran","doi":"10.1109/ICARA56516.2023.10125817","DOIUrl":null,"url":null,"abstract":"Computer vision traditionally uses cameras that capture visual information as frames at periodic intervals. On the other hand, Dynamic Vision Sensors (DVS) capture temporal contrast (TC) in each pixel asynchronously and stream them serially. This paper proposes a hybrid approach to generate input visual data as ‘frame of events’ for a stereo vision pipeline. We demonstrate that using hybrid vision sensors that produce frames made up of TC events can achieve superior results in terms of low latency, less compute and low memory footprint as compared to the traditional cameras and the event-based DVS. The frame-of-events approach eliminates the latency and memory resources involved in the accumulation of asynchronous events into synchronous frames, while generating acceptable disparity maps for depth estimation. Benchmarking results show that the frame-of-events pipeline outperforms others with the least average latency per frame of 3.8 ms and least average memory usage per frame of 112.4 Kb, which amounts to 7.32% and 9.75% reduction when compared to traditional frame-based pipeline. Hence, the proposed method is suitable for missioncritical robotics applications that involve path planning and localization mapping in a resource-constrained environment, such as drone navigation and autonomous vehicles.","PeriodicalId":443572,"journal":{"name":"2023 9th International Conference on Automation, Robotics and Applications (ICARA)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Frame of Events: A Low-latency Resource-efficient Approach for Stereo Depth Maps\",\"authors\":\"Shanmuga Venkatachalam, V. Vivekanand, R. Kubendran\",\"doi\":\"10.1109/ICARA56516.2023.10125817\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Computer vision traditionally uses cameras that capture visual information as frames at periodic intervals. On the other hand, Dynamic Vision Sensors (DVS) capture temporal contrast (TC) in each pixel asynchronously and stream them serially. This paper proposes a hybrid approach to generate input visual data as ‘frame of events’ for a stereo vision pipeline. We demonstrate that using hybrid vision sensors that produce frames made up of TC events can achieve superior results in terms of low latency, less compute and low memory footprint as compared to the traditional cameras and the event-based DVS. The frame-of-events approach eliminates the latency and memory resources involved in the accumulation of asynchronous events into synchronous frames, while generating acceptable disparity maps for depth estimation. Benchmarking results show that the frame-of-events pipeline outperforms others with the least average latency per frame of 3.8 ms and least average memory usage per frame of 112.4 Kb, which amounts to 7.32% and 9.75% reduction when compared to traditional frame-based pipeline. Hence, the proposed method is suitable for missioncritical robotics applications that involve path planning and localization mapping in a resource-constrained environment, such as drone navigation and autonomous vehicles.\",\"PeriodicalId\":443572,\"journal\":{\"name\":\"2023 9th International Conference on Automation, Robotics and Applications (ICARA)\",\"volume\":\"41 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-02-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 9th International Conference on Automation, Robotics and Applications (ICARA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICARA56516.2023.10125817\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 9th International Conference on Automation, Robotics and Applications (ICARA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICARA56516.2023.10125817","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Frame of Events: A Low-latency Resource-efficient Approach for Stereo Depth Maps
Computer vision traditionally uses cameras that capture visual information as frames at periodic intervals. On the other hand, Dynamic Vision Sensors (DVS) capture temporal contrast (TC) in each pixel asynchronously and stream them serially. This paper proposes a hybrid approach to generate input visual data as ‘frame of events’ for a stereo vision pipeline. We demonstrate that using hybrid vision sensors that produce frames made up of TC events can achieve superior results in terms of low latency, less compute and low memory footprint as compared to the traditional cameras and the event-based DVS. The frame-of-events approach eliminates the latency and memory resources involved in the accumulation of asynchronous events into synchronous frames, while generating acceptable disparity maps for depth estimation. Benchmarking results show that the frame-of-events pipeline outperforms others with the least average latency per frame of 3.8 ms and least average memory usage per frame of 112.4 Kb, which amounts to 7.32% and 9.75% reduction when compared to traditional frame-based pipeline. Hence, the proposed method is suitable for missioncritical robotics applications that involve path planning and localization mapping in a resource-constrained environment, such as drone navigation and autonomous vehicles.