{"title":"Deep generative cross-modal on-body accelerometer data synthesis from videos","authors":"Shibo Zhang, N. Alshurafa","doi":"10.1145/3410530.3414329","DOIUrl":null,"url":null,"abstract":"Human activity recognition (HAR) based on wearable sensors has brought tremendous benefit to several industries ranging from healthcare to entertainment. However, to build reliable machine-learned models from wearables, labeled on-body sensor datasets obtained from real-world settings are needed. It is often prohibitively expensive to obtain large-scale, labeled on-body sensor datasets from real-world deployments. The lack of labeled datasets is a major obstacle in the wearable sensor-based activity recognition community. To overcome this problem, I aim to develop two deep generative cross-modal architectures to synthesize accelerometer data streams from video data streams. In the proposed approach, a conditional generative adversarial network (cGAN) is first used to generate sensor data conditioned on video data. Then, a conditional variational autoencoder (cVAE)-cGAN is proposed to further improve representation of the data. The effectiveness and efficacy of the proposed methods will be evaluated through two popular applications in HAR: eating recognition and physical activity recognition. Extensive experiments will be conducted on public sensor-based activity recognition datasets by building models with synthetic data and comparing the models against those trained from real sensor data. This work aims to expand labeled on-body sensor data, by generating synthetic on-body sensor data from video, which will equip the community with methods to transfer labels from video to on-body sensors.","PeriodicalId":7183,"journal":{"name":"Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers","volume":"16 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3410530.3414329","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15
Abstract
Human activity recognition (HAR) based on wearable sensors has brought tremendous benefits to several industries, ranging from healthcare to entertainment. However, building reliable machine-learned models from wearables requires labeled on-body sensor datasets collected in real-world settings, and obtaining such datasets at scale from real-world deployments is often prohibitively expensive. This lack of labeled data is a major obstacle for the wearable sensor-based activity recognition community. To overcome this problem, I aim to develop two deep generative cross-modal architectures that synthesize accelerometer data streams from video data streams. In the proposed approach, a conditional generative adversarial network (cGAN) is first used to generate sensor data conditioned on video data. Then, a conditional variational autoencoder (cVAE)-cGAN is proposed to further improve the representation of the data. The effectiveness of the proposed methods will be evaluated through two popular applications in HAR: eating recognition and physical activity recognition. Extensive experiments will be conducted on public sensor-based activity recognition datasets by building models with synthetic data and comparing them against models trained on real sensor data. This work aims to expand labeled on-body sensor data by generating synthetic on-body sensor data from video, equipping the community with methods to transfer labels from video to on-body sensors.
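To make the cGAN idea concrete, below is a minimal, hypothetical sketch of a video-conditioned GAN that generates 3-axis accelerometer windows from a precomputed video-clip embedding. The window length, embedding dimension, network layers, and loss setup are all assumptions for illustration; they do not reflect the paper's actual architecture or training details.

```python
# Hypothetical sketch: video-conditioned GAN for accelerometer synthesis.
# All module names, dimensions, and the video-feature extractor are assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn

WINDOW = 128      # accelerometer samples per window (assumed)
VID_DIM = 512     # dimension of a precomputed video-clip embedding (assumed)
NOISE_DIM = 100   # latent noise dimension (assumed)

class Generator(nn.Module):
    """Maps (noise, video embedding) -> a 3-axis accelerometer window."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + VID_DIM, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, 3 * WINDOW), nn.Tanh(),
        )
    def forward(self, z, v):
        x = self.net(torch.cat([z, v], dim=1))
        return x.view(-1, 3, WINDOW)

class Discriminator(nn.Module):
    """Scores whether an accelerometer window is real, given the video embedding."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * WINDOW + VID_DIM, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )
    def forward(self, x, v):
        return self.net(torch.cat([x.flatten(1), v], dim=1))

# One adversarial training step on a batch of paired
# (real accelerometer window, video embedding) examples.
G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

real_accel = torch.randn(8, 3, WINDOW)   # placeholder for real sensor windows
video_emb = torch.randn(8, VID_DIM)      # placeholder for video-clip features

# Discriminator update: real windows labeled 1, generated windows labeled 0.
z = torch.randn(8, NOISE_DIM)
fake_accel = G(z, video_emb).detach()
d_loss = bce(D(real_accel, video_emb), torch.ones(8, 1)) + \
         bce(D(fake_accel, video_emb), torch.zeros(8, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator update: try to make generated windows be scored as real.
z = torch.randn(8, NOISE_DIM)
g_loss = bce(D(G(z, video_emb), video_emb), torch.ones(8, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In this sketch the video embedding plays the role of the conditioning signal: the same clip features are fed to both generator and discriminator, so the discriminator can only be fooled by sensor windows that are plausible for that particular video. The cVAE-cGAN variant described in the abstract would additionally learn a latent representation of the sensor data with a conditional VAE before adversarial refinement; that stage is not shown here.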