{"title":"基于视频的深度生成跨模态车身加速度计数据合成","authors":"Shibo Zhang, N. Alshurafa","doi":"10.1145/3410530.3414329","DOIUrl":null,"url":null,"abstract":"Human activity recognition (HAR) based on wearable sensors has brought tremendous benefit to several industries ranging from healthcare to entertainment. However, to build reliable machine-learned models from wearables, labeled on-body sensor datasets obtained from real-world settings are needed. It is often prohibitively expensive to obtain large-scale, labeled on-body sensor datasets from real-world deployments. The lack of labeled datasets is a major obstacle in the wearable sensor-based activity recognition community. To overcome this problem, I aim to develop two deep generative cross-modal architectures to synthesize accelerometer data streams from video data streams. In the proposed approach, a conditional generative adversarial network (cGAN) is first used to generate sensor data conditioned on video data. Then, a conditional variational autoencoder (cVAE)-cGAN is proposed to further improve representation of the data. The effectiveness and efficacy of the proposed methods will be evaluated through two popular applications in HAR: eating recognition and physical activity recognition. Extensive experiments will be conducted on public sensor-based activity recognition datasets by building models with synthetic data and comparing the models against those trained from real sensor data. This work aims to expand labeled on-body sensor data, by generating synthetic on-body sensor data from video, which will equip the community with methods to transfer labels from video to on-body sensors.","PeriodicalId":7183,"journal":{"name":"Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers","volume":"16 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Deep generative cross-modal on-body accelerometer data synthesis from videos\",\"authors\":\"Shibo Zhang, N. Alshurafa\",\"doi\":\"10.1145/3410530.3414329\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human activity recognition (HAR) based on wearable sensors has brought tremendous benefit to several industries ranging from healthcare to entertainment. However, to build reliable machine-learned models from wearables, labeled on-body sensor datasets obtained from real-world settings are needed. It is often prohibitively expensive to obtain large-scale, labeled on-body sensor datasets from real-world deployments. The lack of labeled datasets is a major obstacle in the wearable sensor-based activity recognition community. To overcome this problem, I aim to develop two deep generative cross-modal architectures to synthesize accelerometer data streams from video data streams. In the proposed approach, a conditional generative adversarial network (cGAN) is first used to generate sensor data conditioned on video data. Then, a conditional variational autoencoder (cVAE)-cGAN is proposed to further improve representation of the data. The effectiveness and efficacy of the proposed methods will be evaluated through two popular applications in HAR: eating recognition and physical activity recognition. Extensive experiments will be conducted on public sensor-based activity recognition datasets by building models with synthetic data and comparing the models against those trained from real sensor data. This work aims to expand labeled on-body sensor data, by generating synthetic on-body sensor data from video, which will equip the community with methods to transfer labels from video to on-body sensors.\",\"PeriodicalId\":7183,\"journal\":{\"name\":\"Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers\",\"volume\":\"16 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3410530.3414329\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3410530.3414329","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deep generative cross-modal on-body accelerometer data synthesis from videos
Human activity recognition (HAR) based on wearable sensors has brought tremendous benefit to several industries ranging from healthcare to entertainment. However, to build reliable machine-learned models from wearables, labeled on-body sensor datasets obtained from real-world settings are needed. It is often prohibitively expensive to obtain large-scale, labeled on-body sensor datasets from real-world deployments. The lack of labeled datasets is a major obstacle in the wearable sensor-based activity recognition community. To overcome this problem, I aim to develop two deep generative cross-modal architectures to synthesize accelerometer data streams from video data streams. In the proposed approach, a conditional generative adversarial network (cGAN) is first used to generate sensor data conditioned on video data. Then, a conditional variational autoencoder (cVAE)-cGAN is proposed to further improve representation of the data. The effectiveness and efficacy of the proposed methods will be evaluated through two popular applications in HAR: eating recognition and physical activity recognition. Extensive experiments will be conducted on public sensor-based activity recognition datasets by building models with synthetic data and comparing the models against those trained from real sensor data. This work aims to expand labeled on-body sensor data, by generating synthetic on-body sensor data from video, which will equip the community with methods to transfer labels from video to on-body sensors.