Spatial-Temporal Masked Autoencoder for Multi-Device Wearable Human Activity Recognition
Shenghuan Miao, Ling Chen, Rong Hu
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, volume 3 3, pages 1-25
Published 2024-01-12. DOI: 10.1145/3631415 (https://doi.org/10.1145/3631415)
Citations: 0
Abstract
The widespread adoption of wearable devices has led to a surge in the development of multi-device wearable human activity recognition (WHAR) systems. Nevertheless, the performance of traditional supervised learning-based WHAR methods is limited by the difficulty of collecting ample annotated wearable data. To overcome this limitation, self-supervised learning (SSL) has emerged as a promising solution: a competent feature extractor is first trained on a substantial quantity of unlabeled data, and a minimal classifier is then refined with a small amount of labeled data. Despite the promise of SSL in WHAR, the majority of studies have not considered missing-device scenarios in multi-device WHAR. To bridge this gap, we propose a multi-device SSL WHAR method termed Spatial-Temporal Masked Autoencoder (STMAE). STMAE captures discriminative activity representations by using an asymmetrical encoder-decoder structure and a two-stage spatial-temporal masking strategy, which exploit the spatial-temporal correlations in multi-device data to improve the performance of SSL WHAR, especially in missing-device scenarios. Experiments on four real-world datasets demonstrate the efficacy of STMAE in various practical scenarios.
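The abstract names a two-stage spatial-temporal masking strategy but does not spell out its details. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes the first (spatial) stage hides whole devices, imitating missing-device scenarios, and the second (temporal) stage hides a fraction of time patches on each remaining device. The function names, the patch grid, and the ratios `device_ratio` and `patch_ratio` are all hypothetical choices made for illustration.

```python
import random


def two_stage_mask(num_devices, num_patches, device_ratio=0.25,
                   patch_ratio=0.5, seed=None):
    """Build a num_devices x num_patches grid; True means the patch is masked.

    Assumed reading of "two-stage spatial-temporal masking" (hypothetical):
    stage 1 masks entire devices, stage 2 masks time patches on the rest.
    """
    rng = random.Random(seed)

    # Stage 1 (spatial): pick whole devices to hide.
    n_masked_devices = int(num_devices * device_ratio)
    masked_devices = set(rng.sample(range(num_devices), n_masked_devices))

    # Stage 2 (temporal): on each remaining device, hide some time patches.
    n_masked_patches = int(num_patches * patch_ratio)
    mask = []
    for device in range(num_devices):
        if device in masked_devices:
            mask.append([True] * num_patches)
        else:
            hidden = set(rng.sample(range(num_patches), n_masked_patches))
            mask.append([t in hidden for t in range(num_patches)])
    return mask


def visible_tokens(mask):
    """List the (device, patch) positions left unmasked."""
    return [(d, t) for d, row in enumerate(mask)
            for t, is_masked in enumerate(row) if not is_masked]


mask = two_stage_mask(num_devices=4, num_patches=8, seed=0)
tokens = visible_tokens(mask)
```

In a masked-autoencoder setup, an asymmetrical design typically means the encoder processes only the visible positions (here, `tokens`) while a lighter decoder reconstructs the masked ones. Whether STMAE follows exactly this split is not stated in the abstract.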