{"title":"基于IMU传感器的野外人类活动识别的不变特征学习","authors":"Yujiao Hao, Boyu Wang, Rong Zheng","doi":"10.1145/3576842.3582390","DOIUrl":null,"url":null,"abstract":"Deep neural network models for IMU sensor-based human activity recognition (HAR) that are trained from controlled, well-curated datasets suffer from poor generalizability in practical deployments. However, data collected from naturalistic settings often contains significant label noise. In this work, we examine two in-the-wild HAR datasets and DivideMix, a state-of-the-art learning with noise labels (LNL) method to understand the extent and impacts of noisy labels in training data. Our empirical analysis reveals that the substantial domain gaps among diverse subjects cause LNL methods to violate a key underlying assumption, namely, neural networks tend to fit simpler (and thus clean) data in early training epochs. Motivated by the insights, we design VALERIAN, an invariant feature learning method for in-the-wild wearable sensor-based HAR. By training a multi-task model with separate task-specific layers for each subject, VALERIAN allows noisy labels to be dealt with individually while benefiting from shared feature representation across subjects. We evaluated VALERIAN on four datasets, two collected in a controlled environment and two in the wild. Experimental results show that VALERIAN significantly outperforms baseline approaches. VALERIAN can correct 75% – 93% of label errors in the source domains. When only 10-second clean labeled data per class is available from a new target subject, even with 40% label noise in training data, it achieves test accuracy. 
Code is available at: https://github.com/YujiaoHao/VALERIAN.git","PeriodicalId":266438,"journal":{"name":"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"VALERIAN: Invariant Feature Learning for IMU Sensor-based Human Activity Recognition in the Wild\",\"authors\":\"Yujiao Hao, Boyu Wang, Rong Zheng\",\"doi\":\"10.1145/3576842.3582390\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep neural network models for IMU sensor-based human activity recognition (HAR) that are trained from controlled, well-curated datasets suffer from poor generalizability in practical deployments. However, data collected from naturalistic settings often contains significant label noise. In this work, we examine two in-the-wild HAR datasets and DivideMix, a state-of-the-art learning with noise labels (LNL) method to understand the extent and impacts of noisy labels in training data. Our empirical analysis reveals that the substantial domain gaps among diverse subjects cause LNL methods to violate a key underlying assumption, namely, neural networks tend to fit simpler (and thus clean) data in early training epochs. Motivated by the insights, we design VALERIAN, an invariant feature learning method for in-the-wild wearable sensor-based HAR. By training a multi-task model with separate task-specific layers for each subject, VALERIAN allows noisy labels to be dealt with individually while benefiting from shared feature representation across subjects. We evaluated VALERIAN on four datasets, two collected in a controlled environment and two in the wild. Experimental results show that VALERIAN significantly outperforms baseline approaches. VALERIAN can correct 75% – 93% of label errors in the source domains. 
When only 10-second clean labeled data per class is available from a new target subject, even with 40% label noise in training data, it achieves test accuracy. Code is available at: https://github.com/YujiaoHao/VALERIAN.git\",\"PeriodicalId\":266438,\"journal\":{\"name\":\"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-03-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3576842.3582390\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3576842.3582390","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
VALERIAN: Invariant Feature Learning for IMU Sensor-based Human Activity Recognition in the Wild
Deep neural network models for IMU sensor-based human activity recognition (HAR) that are trained on controlled, well-curated datasets generalize poorly in practical deployments. Data collected in naturalistic settings, however, often contains significant label noise. In this work, we examine two in-the-wild HAR datasets and DivideMix, a state-of-the-art learning with noisy labels (LNL) method, to understand the extent and impact of noisy labels in training data. Our empirical analysis reveals that the substantial domain gaps among diverse subjects cause LNL methods to violate a key underlying assumption, namely, that neural networks tend to fit simpler (and thus clean) data in early training epochs. Motivated by these insights, we design VALERIAN, an invariant feature learning method for in-the-wild wearable sensor-based HAR. By training a multi-task model with separate task-specific layers for each subject, VALERIAN allows noisy labels to be dealt with individually while benefiting from a shared feature representation across subjects. We evaluated VALERIAN on four datasets, two collected in a controlled environment and two in the wild. Experimental results show that VALERIAN significantly outperforms baseline approaches. VALERIAN can correct 75% to 93% of label errors in the source domains. When only 10 seconds of clean labeled data per class is available from a new target subject, even with 40% label noise in training data, it achieves test accuracy. Code is available at: https://github.com/YujiaoHao/VALERIAN.git
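The multi-task layout described in the abstract, one feature extractor shared by all subjects plus a separate task-specific layer per subject, can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `MultiHeadHAR`, the layer sizes, the single linear layer per stage, and the input window are all hypothetical, and training and label correction are omitted entirely.

```python
import math
import random


class MultiHeadHAR:
    """Toy shared-encoder / per-subject-head model (illustrative only)."""

    def __init__(self, n_features, n_hidden, n_classes, subjects, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small random weights; a real model would be trained instead.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # Shared feature extractor: one linear layer used by every subject.
        self.shared = init(n_hidden, n_features)
        # Task-specific layers: one independent linear head per subject.
        self.heads = {s: init(n_classes, n_hidden) for s in subjects}

    @staticmethod
    def _matvec(weights, x):
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    def features(self, x):
        # ReLU over the shared projection; identical for all subjects.
        return [max(0.0, v) for v in self._matvec(self.shared, x)]

    def predict_proba(self, x, subject):
        # Route the shared features through the head of the given subject.
        logits = self._matvec(self.heads[subject], self.features(x))
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Hypothetical usage: two subjects, four activity classes, and a made-up
# six-value summary of one IMU window (3-axis accelerometer + gyroscope).
model = MultiHeadHAR(n_features=6, n_hidden=8, n_classes=4,
                     subjects=["s1", "s2"])
window = [0.1, -0.2, 0.98, 0.01, 0.0, -0.03]
p1 = model.predict_proba(window, "s1")
p2 = model.predict_proba(window, "s2")
```

Because each subject owns its own head, noisy labels from one subject only influence that subject's task-specific weights directly, while the shared layer is the part that every subject contributes to.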