Weiming Zhang, Yi Huang, Wanting Yu, Xiaoshan Yang, Wei Wang, J. Sang
{"title":"活动识别的多模态属性和特征嵌入","authors":"Weiming Zhang, Yi Huang, Wanting Yu, Xiaoshan Yang, Wei Wang, J. Sang","doi":"10.1145/3338533.3366592","DOIUrl":null,"url":null,"abstract":"Human Activity Recognition (HAR) automatically recognizes human activities such as daily life and work based on digital records, which is of great significance to medical and health fields. Egocentric video and human acceleration data comprehensively describe human activity patterns from different aspects, which have laid a foundation for activity recognition based on multimodal behavior data. However, on the one hand, the low-level multimodal signal structures differ greatly and the mapping to high-level activities is complicated. On the other hand, the activity labeling based on multimodal behavior data has high cost and limited data amount, which limits the technical development in this field. In this paper, an activity recognition model MAFE based on multimodal attribute feature embedding is proposed. Before the activity recognition, the middle-level attribute features are extracted from the low-level signals of different modes. On the one hand, the mapping complexity from the low-level signals to the high-level activities is reduced, and on the other hand, a large number of middle-level attribute labeling data can be used to reduce the dependency on the activity labeling data. We conducted experiments on Stanford-ECM datasets to verify the effectiveness of the proposed MAFE method.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Multimodal Attribute and Feature Embedding for Activity Recognition\",\"authors\":\"Weiming Zhang, Yi Huang, Wanting Yu, Xiaoshan Yang, Wei Wang, J. 
Sang\",\"doi\":\"10.1145/3338533.3366592\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human Activity Recognition (HAR) automatically recognizes human activities such as daily life and work based on digital records, which is of great significance to medical and health fields. Egocentric video and human acceleration data comprehensively describe human activity patterns from different aspects, which have laid a foundation for activity recognition based on multimodal behavior data. However, on the one hand, the low-level multimodal signal structures differ greatly and the mapping to high-level activities is complicated. On the other hand, the activity labeling based on multimodal behavior data has high cost and limited data amount, which limits the technical development in this field. In this paper, an activity recognition model MAFE based on multimodal attribute feature embedding is proposed. Before the activity recognition, the middle-level attribute features are extracted from the low-level signals of different modes. On the one hand, the mapping complexity from the low-level signals to the high-level activities is reduced, and on the other hand, a large number of middle-level attribute labeling data can be used to reduce the dependency on the activity labeling data. 
We conducted experiments on Stanford-ECM datasets to verify the effectiveness of the proposed MAFE method.\",\"PeriodicalId\":273086,\"journal\":{\"name\":\"Proceedings of the ACM Multimedia Asia\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ACM Multimedia Asia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3338533.3366592\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Multimedia Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3338533.3366592","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multimodal Attribute and Feature Embedding for Activity Recognition
Human Activity Recognition (HAR) automatically recognizes human activities, such as those of daily life and work, from digital records, and is of great value to the medical and health fields. Egocentric video and body acceleration data describe human activity patterns from complementary aspects, laying a foundation for activity recognition based on multimodal behavior data. However, the low-level signal structures of the different modalities differ greatly, making the mapping from signals to high-level activities complex; moreover, labeling activities in multimodal behavior data is costly and the amount of labeled data is limited, which constrains technical progress in this field. In this paper, we propose MAFE, an activity recognition model based on multimodal attribute feature embedding. Before activity recognition, mid-level attribute features are extracted from the low-level signals of each modality. This both reduces the complexity of mapping low-level signals to high-level activities and allows abundant mid-level attribute labels to be used, reducing dependence on activity-labeled data. We conducted experiments on the Stanford-ECM dataset to verify the effectiveness of the proposed MAFE method.
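The two-stage idea described in the abstract — first mapping low-level per-modality signals into a shared mid-level attribute space, then classifying activities from the fused attribute embedding — can be sketched roughly as follows. This is an illustrative sketch only: the dimensions, function names, and random (untrained) weights are invented for exposition and do not reproduce the authors' MAFE implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: raw video features, raw acceleration features,
# size of the mid-level attribute space, and number of activity classes.
D_VIDEO, D_ACCEL, N_ATTR, N_ACT = 128, 32, 16, 5

# Stage 1: per-modality attribute extractors. Random linear maps stand in
# for models that, in practice, would be trained on attribute-labeled data.
W_video = rng.normal(size=(D_VIDEO, N_ATTR))
W_accel = rng.normal(size=(D_ACCEL, N_ATTR))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attribute_embedding(video_feat, accel_feat):
    # Map each modality's low-level signal into the attribute space,
    # then fuse the two attribute vectors by concatenation.
    a_video = sigmoid(video_feat @ W_video)
    a_accel = sigmoid(accel_feat @ W_accel)
    return np.concatenate([a_video, a_accel])

# Stage 2: activity classifier operating on the fused attribute embedding
# rather than directly on the heterogeneous low-level signals.
W_act = rng.normal(size=(2 * N_ATTR, N_ACT))

def recognize(video_feat, accel_feat):
    logits = attribute_embedding(video_feat, accel_feat) @ W_act
    return int(np.argmax(logits))

# Run on one random sample of each modality.
activity = recognize(rng.normal(size=D_VIDEO), rng.normal(size=D_ACCEL))
```

The design point the sketch illustrates is that the attribute layer decouples the two learning problems: the signal-to-attribute extractors can be trained on cheaper attribute labels, while only the small attribute-to-activity classifier needs expensive activity labels.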