Human Silhouette and Skeleton Video Synthesis Through Wi-Fi Signals.

IF 6.4 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Neural Systems Pub Date : 2022-05-01 Epub Date: 2022-02-24 DOI:10.1142/S0129065722500150

Danilo Avola, Marco Cascio, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti

{"title":"Human Silhouette and Skeleton Video Synthesis Through Wi-Fi Signals.","authors":"Danilo Avola, Marco Cascio, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti","doi":"10.1142/S0129065722500150","DOIUrl":null,"url":null,"abstract":"<p><p>The increasing availability of wireless access points (APs) is leading toward human sensing applications based on Wi-Fi signals as support or alternative tools to the widespread visual sensors, where the signals enable to address well-known vision-related problems such as illumination changes or occlusions. Indeed, using image synthesis techniques to translate radio frequencies to the visible spectrum can become essential to obtain otherwise unavailable visual data. This domain-to-domain translation is feasible because both objects and people affect electromagnetic waves, causing radio and optical frequencies variations. In the literature, models capable of inferring radio-to-visual features mappings have gained momentum in the last few years since frequency changes can be observed in the radio domain through the channel state information (CSI) of Wi-Fi APs, enabling signal-based feature extraction, e.g. amplitude. On this account, this paper presents a novel two-branch generative neural network that effectively maps radio data into visual features, following a teacher-student design that exploits a cross-modality supervision strategy. The latter conditions signal-based features in the visual domain to completely replace visual data. Once trained, the proposed method synthesizes human silhouette and skeleton videos using exclusively Wi-Fi signals. The approach is evaluated on publicly available data, where it obtains remarkable results for both silhouette and skeleton videos generation, demonstrating the effectiveness of the proposed cross-modality supervision strategy.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"32 5","pages":"2250015"},"PeriodicalIF":6.4000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Neural Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1142/S0129065722500150","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2022/2/24 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 4

Abstract

The increasing availability of wireless access points (APs) is leading toward human sensing applications based on Wi-Fi signals as support or alternative tools to the widespread visual sensors, where the signals enable to address well-known vision-related problems such as illumination changes or occlusions. Indeed, using image synthesis techniques to translate radio frequencies to the visible spectrum can become essential to obtain otherwise unavailable visual data. This domain-to-domain translation is feasible because both objects and people affect electromagnetic waves, causing radio and optical frequencies variations. In the literature, models capable of inferring radio-to-visual features mappings have gained momentum in the last few years since frequency changes can be observed in the radio domain through the channel state information (CSI) of Wi-Fi APs, enabling signal-based feature extraction, e.g. amplitude. On this account, this paper presents a novel two-branch generative neural network that effectively maps radio data into visual features, following a teacher-student design that exploits a cross-modality supervision strategy. The latter conditions signal-based features in the visual domain to completely replace visual data. Once trained, the proposed method synthesizes human silhouette and skeleton videos using exclusively Wi-Fi signals. The approach is evaluated on publicly available data, where it obtains remarkable results for both silhouette and skeleton videos generation, demonstrating the effectiveness of the proposed cross-modality supervision strategy.

查看原文本刊更多论文

通过Wi-Fi信号合成人体剪影和骨骼视频。

无线接入点(ap)的日益普及正在引领基于Wi-Fi信号的人类传感应用，作为广泛使用的视觉传感器的支持或替代工具，其中信号能够解决众所周知的视觉相关问题，如照明变化或遮挡。事实上，使用图像合成技术将无线电频率转换为可见光谱对于获得否则无法获得的视觉数据至关重要。这种域到域的转换是可行的，因为物体和人都会影响电磁波，导致无线电和光学频率的变化。在文献中，能够推断无线电到视觉特征映射的模型在过去几年中获得了发展势头，因为可以通过Wi-Fi ap的信道状态信息(CSI)在无线电域中观察到频率变化，从而实现基于信号的特征提取，例如振幅。鉴于此，本文提出了一种新型的双分支生成神经网络，该网络可以有效地将无线电数据映射到视觉特征中，遵循利用跨模态监督策略的师生设计。后者的条件是基于信号的特征在视觉领域完全取代视觉数据。一旦训练，所提出的方法合成人体轮廓和骨骼视频只使用Wi-Fi信号。该方法在公开可用的数据上进行了评估，在剪影和骨架视频生成方面获得了显着的结果，证明了所提出的跨模态监督策略的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

International Journal of Neural Systems 工程技术-计算机：人工智能

CiteScore

11.30

自引率

28.80%

发文量

116

审稿时长

24 months

期刊介绍： The International Journal of Neural Systems is a monthly, rigorously peer-reviewed transdisciplinary journal focusing on information processing in both natural and artificial neural systems. Special interests include machine learning, computational neuroscience and neurology. The journal prioritizes innovative, high-impact articles spanning multiple fields, including neurosciences and computer science and engineering. It adopts an open-minded approach to this multidisciplinary field, serving as a platform for novel ideas and enhanced understanding of collective and cooperative phenomena in computationally capable systems.