Unravelling of Convolutional Neural Networks through Bharatanatyam Mudra Classification with Limited Data

Anuja P. Parameshwaran, Heta P. Desai, M. Weeks, Rajshekhar Sunderraman

2020 10th Annual Computing and Communication Workshop and Conference (CCWC), 2020. DOI: 10.1109/CCWC47524.2020.9031185
Non-verbal forms of communication are universal, being free of any language barrier, and are widely used across art forms. For example, in Bharatanatyam, an ancient Indian dance form, artists use different hand gestures, body postures and facial expressions to convey the story line. Because identification and classification of these complex, multi-variant visual images is difficult, the problem is now being addressed with the help of advanced computer vision techniques and deep neural networks. This work studies the automated identification, classification and labelling of selected Bharatanatyam gestures, as part of our efforts to preserve this rich cultural heritage for future generations. The classification of the mudras against their true labels was carried out using different singular pre-trained / non-pre-trained as well as stacked ensemble convolutional neural architectures (CNNs). In all, data for twenty-seven classes of asamyukta hasta (single-hand gestures) were collected from Google, YouTube and a few real-time performances by artists. Since the backgrounds in many frames are highly diverse, the acquired data are real and dynamic, compared to images from closed laboratory settings. Mislabeled data were cleansed from the dataset through label transfer based on a distance-based similarity metric, computed with a convolutional siamese neural network. The classification of mudras was done using different CNN architectures: i) singular models, ii) ensemble models, and iii) a few specialized models. This study achieved an accuracy of >95% with both single and double transfer learning models, as well as with their stacked ensemble model. The results emphasize the crucial role of domain similarity between the pre-training and training datasets in improving classification accuracy, and also indicate that the doubly pre-trained CNN model yields the highest accuracy.
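The label-transfer cleansing step described in the abstract pairs a suspect image with reference images and keeps the label of the nearest embedding under a distance-based metric. Below is a minimal sketch of such a convolutional siamese setup in PyTorch; the branch architecture, the contrastive loss, and the names (`EmbeddingNet`, `transfer_label`) are illustrative assumptions, not the authors' implementation, which the abstract does not detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Illustrative shared CNN branch that maps a mudra image to a feature
    vector; the paper's actual siamese architecture is not specified."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # (B, 64, 1, 1) regardless of input size
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def contrastive_loss(dist, same, margin=1.0):
    """Standard contrastive loss over embedding distances (assumed here;
    the abstract only states a distance-based similarity metric)."""
    return (same * dist.pow(2) +
            (1 - same) * F.relu(margin - dist).pow(2)).mean()

def transfer_label(net, suspect, references, ref_labels):
    """Relabel a suspect image (1xCxHxW) with the label of its nearest
    reference embedding, by Euclidean distance."""
    with torch.no_grad():
        d = (net(references) - net(suspect)).norm(dim=1)  # (N,) distances
    return ref_labels[d.argmin()]
```

Once the siamese branch is trained on matching/non-matching pairs, `transfer_label` can be run over every suspect frame against a small set of trusted reference images per mudra class.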
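The abstract also contrasts singly and doubly pre-trained CNNs with a stacked ensemble over them. The sketch below illustrates the general pattern only: it assumes a torchvision ImageNet backbone and a linear meta-learner, since the abstract names neither the backbones, the intermediate dataset used for the second pre-training, nor the meta-learner.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_MUDRAS = 27  # asamyukta hasta classes, per the abstract

def pretrained_classifier(num_classes=NUM_MUDRAS):
    """Single transfer learning: ImageNet weights, frozen features, and a new
    27-way head. A doubly pre-trained variant would first fine-tune on an
    intermediate hand-gesture dataset (unspecified here) before this step."""
    m = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    for p in m.parameters():
        p.requires_grad = False
    m.fc = nn.Linear(m.fc.in_features, num_classes)  # trainable head
    return m

class StackedEnsemble(nn.Module):
    """Stacked ensemble: a meta-learner over the concatenated class
    probabilities of frozen base CNNs (linear meta-learner assumed)."""
    def __init__(self, base_models, num_classes=NUM_MUDRAS):
        super().__init__()
        self.bases = nn.ModuleList(base_models)
        self.meta = nn.Linear(len(base_models) * num_classes, num_classes)

    def forward(self, x):
        with torch.no_grad():  # base models stay fixed; only meta trains
            probs = [m(x).softmax(dim=1) for m in self.bases]
        return self.meta(torch.cat(probs, dim=1))
```

In this pattern the base models are trained first, then frozen, and only the meta-learner is fit on their stacked outputs, which matches the usual meaning of a stacked ensemble.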