LASO: Exploiting Locomotive and Acoustic Signatures over the Edge to Annotate IMU Data for Human Activity Recognition

Proceedings of the 2020 International Conference on Multimodal Interaction, Pub Date: 2020-10-21, DOI: 10.1145/3382507.3418826

S. Chatterjee, Avijoy Chakma, A. Gangopadhyay, Nirmalya Roy, Bivas Mitra, Sandip Chakraborty


Citations: 8

Abstract

Annotated IMU sensor data from smart devices and wearables are essential for developing supervised models for fine-grained human activity recognition, but generating sufficient annotated data for diverse human activities under different environments is challenging. Existing approaches primarily use human-in-the-loop techniques, including active learning; however, they are tedious, costly, and time-consuming. In this paper, we leverage the acoustic data available from microphones embedded in the data collection devices and propose LASO, a multimodal approach for automated data annotation from acoustic and locomotive information. LASO runs on the edge device itself, so that only the annotated IMU data is collected and the acoustic data is discarded on the device, preserving the audio privacy of the user. In the absence of any pre-existing labeling information, such auto-annotation is challenging because the IMU data needs to be sessionized into activities of different time scales in a completely unsupervised manner. We use a change-point detection technique while synchronizing the locomotive information from the IMU data with the acoustic data, and then use pre-trained audio-based activity recognition models to label the IMU data while handling acoustic noise. LASO efficiently annotates IMU data, without any explicit human intervention, with a mean accuracy of $0.93$ ($\pm 0.04$) and $0.78$ ($\pm 0.05$) on two real-life datasets from workshop and kitchen environments, respectively.
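The annotation idea in the abstract (segment the IMU stream at change points, then label each segment from time-aligned audio predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the mean-shift detector, the majority vote, the function names `detect_change_points` and `annotate_segments`, and the `window` and `threshold` parameters are all assumptions chosen for illustration, and the audio labels are assumed to be already synchronized with the IMU samples one-to-one.

```python
from collections import Counter
from statistics import mean


def detect_change_points(signal, window=5, threshold=1.0):
    """Hypothetical mean-shift detector (an assumption, not the paper's method).

    An index i is a candidate when the mean of the next `window` samples
    differs from the mean of the previous `window` samples by more than
    `threshold`. Within each run of consecutive candidates, only the index
    with the largest difference is kept.
    """
    scores = {}
    for i in range(window, len(signal) - window + 1):
        left = mean(signal[i - window:i])
        right = mean(signal[i:i + window])
        diff = abs(right - left)
        if diff > threshold:
            scores[i] = diff

    change_points = []
    run = []
    for i in sorted(scores):
        if run and i != run[-1] + 1:
            # The run of consecutive candidates ended: keep its strongest index.
            change_points.append(max(run, key=scores.get))
            run = []
        run.append(i)
    if run:
        change_points.append(max(run, key=scores.get))
    return change_points


def annotate_segments(n_samples, change_points, audio_labels):
    """Split [0, n_samples) at the change points and assign each segment the
    most frequent audio-derived label inside it (majority vote), so a few
    noisy audio predictions do not change the segment label."""
    bounds = [0] + list(change_points) + [n_samples]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        if end <= start:
            continue
        label = Counter(audio_labels[start:end]).most_common(1)[0][0]
        segments.append((start, end, label))
    return segments


if __name__ == "__main__":
    # Synthetic IMU magnitude: a low-motion activity followed by a high-motion one.
    imu = [0.1] * 20 + [3.0] * 20
    # Synthetic per-sample audio predictions with two noisy frames.
    audio = ["drill"] * 18 + ["noise"] * 2 + ["saw"] * 20

    cps = detect_change_points(imu)
    print(cps)
    print(annotate_segments(len(imu), cps, audio))
```

In this toy setup the audio labels are only used on the device to name the segments; the output keeps segment boundaries and labels, which mirrors the abstract's point that the raw audio does not need to leave the device.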

