Spiking Depth: Depth estimation from sparse events with spiking neural networks

Impact Factor 7.5 | Zone 1, Computer Science | JCR Q1: Computer Science, Artificial Intelligence

Expert Systems with Applications Pub Date : 2025-10-10 DOI:10.1016/j.eswa.2025.129977

Dongze Liu, Yimeng Fan, Wenrui Lu, Changsong Liu, Wei Zhang

Volume 299, Article 129977. Full text: https://www.sciencedirect.com/science/article/pii/S0957417425035924

Citations: 0

Abstract

Event cameras provide remarkable temporal resolution, wide dynamic range, and low power consumption, making them ideal for depth estimation in high-contrast and dynamic environments. While spiking neural networks (SNNs) are naturally suited to processing event data, their performance in depth estimation tasks has not consistently surpassed that of traditional artificial neural networks (ANNs) because they lack effective mechanisms for handling the sparse nature of event data. Herein, we propose Spiking Depth, a novel end-to-end SNN framework designed to overcome the limitations of current ANN models and achieve superior depth estimation from sparse event data. In particular, Spiking Depth introduces three key innovations: an event encoding module based on a spiking-driven fusion block (SDFB), enhanced skip connections incorporating both SDFB and an adaptive spiking convolutional block attention module, and an event depth loss that optimizes depth estimation by addressing the sparse and dynamic nature of event data. Spiking Depth outperforms current state-of-the-art SNN and ANN models on two event-based datasets: the Multi Vehicle Stereo Event Camera (MVSEC) dataset, which is a real-world dataset, and a synthetic dataset. On the MVSEC dataset, our model achieves mean depth error values of 11.8 cm, 18.0 cm, and 12.5 cm for Splits 1, 2, and 3, respectively, setting a new benchmark for event-based depth estimation with significantly lower power consumption.
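The abstract reports results as mean depth error in centimeters. As a rough illustration of what such a figure measures, the sketch below averages the absolute difference between predicted and ground-truth depth over pixels that have usable ground truth. The function name `mean_depth_error`, the validity rule (finite, positive ground truth), and the sample values are assumptions made for illustration; the abstract does not give the paper's exact evaluation protocol.

```python
import math


def mean_depth_error(pred, truth):
    """Mean absolute depth error, in the same unit as the inputs.

    Only pixels with finite, positive ground-truth depth are counted.
    This validity rule is an illustrative assumption, not the paper's.
    """
    total = 0.0
    count = 0
    for p, t in zip(pred, truth):
        if not math.isfinite(t) or t <= 0:
            continue  # skip pixels without usable ground truth
        total += abs(p - t)
        count += 1
    if count == 0:
        raise ValueError("no valid ground-truth pixels")
    return total / count


# Hypothetical depths in meters; two of the four pixels lack ground truth.
pred = [2.10, 5.00, 9.80, 3.00]
truth = [2.00, float("nan"), 10.00, 0.0]
error_m = mean_depth_error(pred, truth)
error_cm = error_m * 100.0  # convert meters to centimeters for reporting
```

Masking out pixels without ground truth matters for real-world recordings, where the reference depth map is typically incomplete, so averaging over all pixels would distort the result.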


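Source journal

Expert Systems with Applications (Engineering and Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months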

Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.