Continuous-time Object Segmentation using High Temporal Resolution Event Camera.

Lin Zhu, Xianzhang Chen, Lizhi Wang, Xiao Wang, Yonghong Tian, Hua Huang
{"title":"Continuous-time Object Segmentation using High Temporal Resolution Event Camera.","authors":"Lin Zhu, Xianzhang Chen, Lizhi Wang, Xiao Wang, Yonghong Tian, Hua Huang","doi":"10.1109/TPAMI.2024.3477591","DOIUrl":null,"url":null,"abstract":"<p><p>Event cameras are novel bio-inspired sensors, where individual pixels operate independently and asynchronously, generating intensity changes as events. Leveraging the microsecond resolution (no motion blur) and high dynamic range (compatible with extreme light conditions) of events, there is considerable promise in directly segmenting objects from sparse and asynchronous event streams in various applications. However, different from the rich cues in video object segmentation, it is challenging to segment complete objects from the sparse event stream. In this paper, we present the first framework for continuous-time object segmentation from event stream. Given the object mask at the initial time, our task aims to segment the complete object at any subsequent time in event streams. Specifically, our framework consists of a Recurrent Temporal Embedding Extraction (RTEE) module based on a novel ResLSTM, a Cross-time Spatiotemporal Feature Modeling (CSFM) module which is a transformer architecture with long-term and short-term matching modules, and a segmentation head. The historical events and masks (reference sets) are recurrently fed into our framework along with current-time events. The temporal embedding is updated as new events are input, enabling our framework to continuously process the event stream. To train and test our model, we construct both real-world and simulated event-based object segmentation datasets, each comprising event streams, APS images, and object annotations. Extensive experiments on our datasets demonstrate the effectiveness of the proposed recurrent architecture. Our code and dataset are available at https://sites.google.com/view/ecos-net/.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2024.3477591","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Event cameras are novel bio-inspired sensors in which individual pixels operate independently and asynchronously, reporting intensity changes as events. Leveraging the microsecond temporal resolution (no motion blur) and high dynamic range (robust to extreme lighting conditions) of events, directly segmenting objects from sparse, asynchronous event streams holds considerable promise for a variety of applications. However, unlike video object segmentation, which can draw on rich appearance cues, segmenting complete objects from a sparse event stream is challenging. In this paper, we present the first framework for continuous-time object segmentation from event streams. Given the object mask at an initial time, our task is to segment the complete object at any subsequent time in the event stream. Specifically, our framework consists of a Recurrent Temporal Embedding Extraction (RTEE) module built on a novel ResLSTM, a Cross-time Spatiotemporal Feature Modeling (CSFM) module, which is a transformer architecture with long-term and short-term matching modules, and a segmentation head. Historical events and masks (the reference sets) are recurrently fed into the framework along with current-time events. The temporal embedding is updated as new events arrive, enabling the framework to process the event stream continuously. To train and test our model, we construct both real-world and simulated event-based object segmentation datasets, each comprising event streams, APS images, and object annotations. Extensive experiments on these datasets demonstrate the effectiveness of the proposed recurrent architecture. Our code and datasets are available at https://sites.google.com/view/ecos-net/.
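To make the recurrent pipeline concrete, the following is a minimal PyTorch sketch of one processing step, not the authors' implementation (which is linked from the project page above). All names here (`events_to_voxel`, `RTEECell`, `CSFMBlock`, `ECOSNetSketch`) are hypothetical stand-ins: a voxel-grid event representation is assumed, a single ConvLSTM-style cell stands in for the ResLSTM-based RTEE, and one cross-attention layer stands in for the long/short-term matching in CSFM. The sketch follows the flow the abstract describes: encode the current event slice, recurrently update the temporal embedding, cross-attend to the reference set, and decode a mask.

```python
import torch
import torch.nn as nn

def events_to_voxel(events, bins, H, W):
    """Rasterize an (N, 4) tensor of events [x, y, t, polarity] into a
    (bins, H, W) voxel grid -- a common dense input representation."""
    grid = torch.zeros(bins, H, W)
    x, y, t, p = events[:, 0].long(), events[:, 1].long(), events[:, 2], events[:, 3]
    t = (t - t.min()) / (t.max() - t.min() + 1e-9)   # normalize timestamps to [0, 1]
    b = (t * (bins - 1)).long()                      # temporal bin index per event
    vals = torch.where(p > 0, torch.ones_like(t), -torch.ones_like(t))
    grid.index_put_((b, y, x), vals, accumulate=True)
    return grid

class RTEECell(nn.Module):
    """Stand-in for the RTEE module: a single ConvLSTM-style cell that
    recurrently updates the temporal embedding as new events arrive."""
    def __init__(self, in_ch, hid_ch):
        super().__init__()
        # One convolution jointly produces input/forget/output/cell gates.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, 3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

class CSFMBlock(nn.Module):
    """Stand-in for CSFM: cross-attention from current-time tokens (queries)
    to tokens encoding the reference set of past events and masks."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, cur, refs):
        # cur: (B, HW, C) current tokens; refs: (B, T*HW, C) reference tokens.
        out, _ = self.attn(cur, refs, refs)
        return out

class ECOSNetSketch(nn.Module):
    def __init__(self, bins=5, dim=64):
        super().__init__()
        self.encode = nn.Conv2d(bins, dim, 3, padding=1)
        self.rtee = RTEECell(dim, dim)
        self.csfm = CSFMBlock(dim)
        self.head = nn.Conv2d(dim, 1, 1)   # per-pixel mask logits

    def forward(self, voxel, state, ref_tokens):
        x = self.encode(voxel.unsqueeze(0))       # (1, C, H, W)
        emb, state = self.rtee(x, state)          # update temporal embedding
        B, C, H, W = emb.shape
        cur = emb.flatten(2).transpose(1, 2)      # (1, HW, C)
        matched = self.csfm(cur, ref_tokens)      # cross-time matching
        feat = matched.transpose(1, 2).reshape(B, C, H, W)
        return self.head(feat), state
```

A usage sketch for one query time, with random events standing in for a real stream:

```python
H = W = 32
model = ECOSNetSketch()
state = (torch.zeros(1, 64, H, W), torch.zeros(1, 64, H, W))   # initial (h, c)
ref_tokens = torch.randn(1, H * W, 64)    # stand-in for the encoded reference set
events = torch.rand(1000, 4)              # random [x, y, t, p] for illustration
events[:, 0] *= W - 1
events[:, 1] *= H - 1
events[:, 3] = events[:, 3].round() * 2 - 1   # polarities in {-1, +1}
voxel = events_to_voxel(events, bins=5, H=H, W=W)
logits, state = model(voxel, state, ref_tokens)   # mask logits at this time
```

Because the hidden state is carried across calls, segmenting at the next query time only requires rasterizing the new events and calling the model again, which is the property that makes continuous-time querying of the stream possible.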
