Zero-Shot Learning for Gesture Recognition
Naveen Madapana
Proceedings of the 2020 International Conference on Multimodal Interaction, published 2020-10-21
DOI: 10.1145/3382507.3421161 (https://doi.org/10.1145/3382507.3421161)
Citations: 5
Abstract
Zero-Shot Learning (ZSL) is a machine learning paradigm that aims to recognize classes that are not present in the training data, so that a model can handle categories it has never seen before. While deep learning has pushed the limits of unseen object recognition, ZSL for temporal problems such as unfamiliar gesture recognition (referred to as ZSGL) remains unexplored. ZSGL has the potential to produce efficient human-machine interfaces that can recognize and understand the spontaneous and conversational gestures of humans. The objective of this work is to conceptualize, model and develop a framework to tackle ZSGL problems. The first step in the pipeline is to develop a database of gesture attributes that are representative of a range of categories. Next, a deep architecture consisting of convolutional and recurrent layers is proposed to jointly optimize the semantic and classification losses. Lastly, experiments are performed to compare the proposed model against existing ZSL models on the CGD 2013 and MSRC-12 datasets. In our preliminary work, we identified a list of 64 discriminative attributes related to gestures' morphological characteristics. Our approach yields an unseen-class accuracy of 41%, which outperforms the state-of-the-art approaches by a considerable margin. Future work involves the following: (1) modifying the existing architecture to improve ZSL accuracy, (2) augmenting the database of attributes to incorporate semantic properties, (3) addressing the data imbalance that is inherent to ZSL problems, and (4) expanding this research to other domains such as surgeme and action recognition.
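The abstract does not spell out how attribute predictions are turned into unseen-class labels, nor the exact form of the two losses. As a rough illustration only, the following sketch assumes a common attribute-based ZSL setup: each class has a binary attribute signature, a model outputs attribute scores for a gesture, and the unseen class with the nearest signature is chosen. The class names, the four-attribute signatures, the function names, and the cross-entropy plus mean-squared-error form of the joint loss are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical attribute signatures: one binary vector per unseen gesture class.
# The abstract mentions 64 attributes; four are used here purely for illustration.
UNSEEN_SIGNATURES = {
    "swipe_left": [1, 0, 1, 0],
    "circle": [0, 1, 0, 1],
    "push": [1, 1, 0, 0],
}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_unseen_class(attribute_scores, signatures):
    """Return the class whose signature is closest to the predicted scores."""
    return min(
        signatures,
        key=lambda name: squared_distance(attribute_scores, signatures[name]),
    )


def joint_loss(class_probs, true_class, attribute_scores, true_attributes, weight=1.0):
    """Assumed joint objective: classification loss plus weighted semantic loss.

    Classification term: cross-entropy on the seen-class label.
    Semantic term: mean squared error between predicted and true attributes.
    """
    classification = -math.log(class_probs[true_class])
    semantic = squared_distance(attribute_scores, true_attributes) / len(true_attributes)
    return classification + weight * semantic


# Attribute scores that a trained network might output for one gesture sample.
scores = [0.9, 0.2, 0.8, 0.1]
print(predict_unseen_class(scores, UNSEEN_SIGNATURES))
```

Because inference only needs the attribute signatures of the candidate classes, new gesture classes can be added by describing them in terms of the attributes, without collecting training examples for them.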