An Asynchronous Intensity Representation for Framed and Event Video Sources
Andrew C. Freeman, Montek Singh, Ketan Mayer-Patel
{"title":"帧视频源和事件视频源的异步强度表示","authors":"Andrew C. Freeman, Montek Singh, Ketan Mayer-Patel","doi":"10.1145/3587819.3590969","DOIUrl":null,"url":null,"abstract":"Neuromorphic \"event\" cameras, designed to mimic the human vision system with asynchronous sensing, unlock a new realm of high-speed and high-dynamic-range applications. However, researchers often either revert to a framed representation of event data for applications, or build bespoke applications for a particular camera's event data type. To usher in the next era of video systems, accommodate new event camera designs, and explore the benefits of asynchronous video in classical applications, we argue that there is a need for an asynchronous, source-agnostic video representation. In this paper, we introduce a novel, asynchronous intensity representation for both framed and non-framed data sources. We show that our representation can increase intensity precision and greatly reduce the number of samples per pixel compared to grid-based representations. With framed sources, we demonstrate that by permitting a small amount of loss through the temporal averaging of stable pixel values, we can reduce our representational sample rate by more than half, while incurring a drop in VMAF quality score of only 4.5. We also demonstrate lower latency than the state-of-the-art method for fusing and transcoding framed and event camera data to an intensity representation, while maintaining 2000X the temporal resolution. We argue that our method provides the computational efficiency and temporal granularity necessary to build real-time intensity-based applications for event video.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"17 1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"An Asynchronous Intensity Representation for Framed and Event Video Sources\",\"authors\":\"Andrew C. Freeman, Montek Singh, Ketan Mayer-Patel\",\"doi\":\"10.1145/3587819.3590969\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Neuromorphic \\\"event\\\" cameras, designed to mimic the human vision system with asynchronous sensing, unlock a new realm of high-speed and high-dynamic-range applications. However, researchers often either revert to a framed representation of event data for applications, or build bespoke applications for a particular camera's event data type. To usher in the next era of video systems, accommodate new event camera designs, and explore the benefits of asynchronous video in classical applications, we argue that there is a need for an asynchronous, source-agnostic video representation. In this paper, we introduce a novel, asynchronous intensity representation for both framed and non-framed data sources. We show that our representation can increase intensity precision and greatly reduce the number of samples per pixel compared to grid-based representations. With framed sources, we demonstrate that by permitting a small amount of loss through the temporal averaging of stable pixel values, we can reduce our representational sample rate by more than half, while incurring a drop in VMAF quality score of only 4.5. We also demonstrate lower latency than the state-of-the-art method for fusing and transcoding framed and event camera data to an intensity representation, while maintaining 2000X the temporal resolution. 
We argue that our method provides the computational efficiency and temporal granularity necessary to build real-time intensity-based applications for event video.\",\"PeriodicalId\":330983,\"journal\":{\"name\":\"Proceedings of the 14th Conference on ACM Multimedia Systems\",\"volume\":\"17 1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 14th Conference on ACM Multimedia Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3587819.3590969\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 14th Conference on ACM Multimedia Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3587819.3590969","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Neuromorphic "event" cameras, designed to mimic the human vision system with asynchronous sensing, unlock a new realm of high-speed and high-dynamic-range applications. However, researchers often either revert to a framed representation of event data for applications, or build bespoke applications for a particular camera's event data type. To usher in the next era of video systems, accommodate new event camera designs, and explore the benefits of asynchronous video in classical applications, we argue that there is a need for an asynchronous, source-agnostic video representation. In this paper, we introduce a novel, asynchronous intensity representation for both framed and non-framed data sources. We show that our representation can increase intensity precision and greatly reduce the number of samples per pixel compared to grid-based representations. With framed sources, we demonstrate that by permitting a small amount of loss through the temporal averaging of stable pixel values, we can reduce our representational sample rate by more than half, while incurring a drop in VMAF quality score of only 4.5. We also demonstrate lower latency than the state-of-the-art method for fusing and transcoding framed and event camera data to an intensity representation, while maintaining 2000X the temporal resolution. We argue that our method provides the computational efficiency and temporal granularity necessary to build real-time intensity-based applications for event video.
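
To make the abstract's framed-source idea concrete, below is a minimal, illustrative Python sketch of transcoding framed video into asynchronous per-pixel intensity samples: consecutive frames in which a pixel stays stable are merged into one longer sample by temporal averaging, reducing the sample count at the cost of a small amount of loss. The PixelSample layout, the transcode_frames name, and the absolute-difference stability test are assumptions made here for illustration only; they are not the paper's actual data model or codec.

# Illustrative sketch only; event layout and stability rule are assumptions,
# not the authors' representation.
from dataclasses import dataclass

@dataclass
class PixelSample:
    x: int            # pixel column
    y: int            # pixel row
    start: float      # time the intensity span begins (seconds)
    duration: float   # how long the intensity held (seconds)
    intensity: float  # average intensity over the span

def transcode_frames(frames, frame_dt, threshold):
    """Convert a list of 2-D intensity frames (lists of lists of floats)
    into asynchronous per-pixel samples. Consecutive frames whose pixel
    value stays within `threshold` of the running average are merged into
    one longer sample via temporal averaging."""
    if not frames:
        return []
    height, width = len(frames[0]), len(frames[0][0])
    samples = []
    for y in range(height):
        for x in range(width):
            run_start, run_sum, run_len = 0, frames[0][y][x], 1
            for t in range(1, len(frames)):
                v = frames[t][y][x]
                avg = run_sum / run_len
                if abs(v - avg) <= threshold:
                    # Pixel is stable: extend the current span.
                    run_sum += v
                    run_len += 1
                else:
                    # Pixel changed: emit the finished span, start a new one.
                    samples.append(PixelSample(x, y, run_start * frame_dt,
                                               run_len * frame_dt,
                                               run_sum / run_len))
                    run_start, run_sum, run_len = t, v, 1
            samples.append(PixelSample(x, y, run_start * frame_dt,
                                       run_len * frame_dt,
                                       run_sum / run_len))
    return samples

# Example: a 1x1 video whose pixel holds near 0.50 for three frames,
# then jumps to 0.90. The transcode emits two samples instead of four:
# one averaged three-frame span (~0.503) and one single-frame span (0.90).
frames = [[[0.50]], [[0.52]], [[0.49]], [[0.90]]]
for s in transcode_frames(frames, frame_dt=1/30, threshold=0.05):
    print(s)

With threshold set to zero the transcode is lossless (one sample per changed frame value); raising it merges more stable pixels and cuts the sample count, which mirrors the rate-versus-quality trade-off the abstract reports for framed sources.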