SkeletonCLIP: Recognizing Skeleton-based Human Actions with Text Prompts

Lin Yuan, Zhen He, Qianqian Wang, Leiyang Xu, Xiang Ma
{"title":"用文本提示识别基于骨骼的人类动作","authors":"Lin Yuan, Zhen He, Qianqian Wang, Leiyang Xu, Xiang Ma","doi":"10.1109/ICSAI57119.2022.10005459","DOIUrl":null,"url":null,"abstract":"Human action recognition has been a hot research for decades, and mainstream supervised frameworks include a feature extraction backbone and a softmax classifier to predict daily human actions. When the number of classes applied to the dataset changes, we must retrain the classifier on the well-trained backbone. This pipeline restricts the generalization and transfer ability of the model due to an extra training period. Moreover, replacing action labels with simple number labels discards useful semantic information and can only receive a meaningless classifier at last. In this work, we present a model SkeletonCLIP for skeleton-based human action recognition. We add an alternative text encoder to extract semantic information from labels while keeping the original sequence encoder. We use dot production to measure the similarities of sequence-text pairs in place of traditional classifier head and cross-entropy loss. Experiments from three human action datasets show that our framework can reach a higher recognition accuracy with the help of semantic information when training the network from scratch. The code has been shown at eunseo-v/SkeletonCLIP.","PeriodicalId":339547,"journal":{"name":"2022 8th International Conference on Systems and Informatics (ICSAI)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SkeletonCLIP: Recognizing Skeleton-based Human Actions with Text Prompts\",\"authors\":\"Lin Yuan, Zhen He, Qianqian Wang, Leiyang Xu, Xiang Ma\",\"doi\":\"10.1109/ICSAI57119.2022.10005459\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human action recognition has been a hot research for decades, and mainstream supervised frameworks include a feature extraction backbone and a softmax classifier to predict daily human actions. When the number of classes applied to the dataset changes, we must retrain the classifier on the well-trained backbone. This pipeline restricts the generalization and transfer ability of the model due to an extra training period. Moreover, replacing action labels with simple number labels discards useful semantic information and can only receive a meaningless classifier at last. In this work, we present a model SkeletonCLIP for skeleton-based human action recognition. We add an alternative text encoder to extract semantic information from labels while keeping the original sequence encoder. We use dot production to measure the similarities of sequence-text pairs in place of traditional classifier head and cross-entropy loss. Experiments from three human action datasets show that our framework can reach a higher recognition accuracy with the help of semantic information when training the network from scratch. 
The code has been shown at eunseo-v/SkeletonCLIP.\",\"PeriodicalId\":339547,\"journal\":{\"name\":\"2022 8th International Conference on Systems and Informatics (ICSAI)\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 8th International Conference on Systems and Informatics (ICSAI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSAI57119.2022.10005459\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 8th International Conference on Systems and Informatics (ICSAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSAI57119.2022.10005459","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Human action recognition has been an active research topic for decades. Mainstream supervised frameworks pair a feature-extraction backbone with a softmax classifier to predict everyday human actions. Whenever the number of classes in a dataset changes, the classifier must be retrained on top of the already-trained backbone; this extra training stage limits the model's generalization and transfer ability. Moreover, replacing action labels with bare numeric labels discards useful semantic information, so the resulting classifier carries no semantic meaning. In this work, we present SkeletonCLIP, a model for skeleton-based human action recognition. We keep the original sequence encoder and add a text encoder that extracts semantic information from the action labels. In place of the traditional classifier head and cross-entropy loss, we use the dot product to measure the similarity of sequence-text pairs. Experiments on three human action datasets show that, when the network is trained from scratch, our framework reaches higher recognition accuracy with the help of this semantic information. The code is available at eunseo-v/SkeletonCLIP.
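
To make the matching step concrete, here is a minimal sketch of classification by dot-product similarity between sequence and text embeddings. It is not the authors' implementation (see eunseo-v/SkeletonCLIP for that): the embedding dimension, the L2 normalization, and the random placeholder tensors standing in for the two encoders' outputs are all illustrative assumptions.

import torch
import torch.nn.functional as F

def dot_product_classify(seq_emb: torch.Tensor,
                         label_emb: torch.Tensor) -> torch.Tensor:
    """Score each skeleton-sequence embedding against every label-text
    embedding and return the index of the best-matching action label.

    seq_emb:   (B, D) sequence-encoder outputs for a batch of skeletons
    label_emb: (C, D) one text-encoder output per class-label prompt
    """
    # Assumption: L2-normalizing both sides (as in CLIP) turns the dot
    # product into a cosine similarity, keeping scores comparable across labels.
    seq_emb = F.normalize(seq_emb, dim=-1)
    label_emb = F.normalize(label_emb, dim=-1)

    similarity = seq_emb @ label_emb.t()  # (B, C) sequence-text similarities
    return similarity.argmax(dim=-1)      # best-matching label per sequence

# Toy usage: random tensors stand in for the two encoders' outputs.
batch_size, num_classes, embed_dim = 4, 60, 256
seq_emb = torch.randn(batch_size, embed_dim)
label_emb = torch.randn(num_classes, embed_dim)
print(dot_product_classify(seq_emb, label_emb))  # e.g. tensor([17, 3, 42, 9])

Because classification reduces to comparing a sequence against label-text embeddings, adding or removing classes only changes the set of text prompts rather than requiring a retrained classifier head, which is the generalization benefit the abstract describes.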