A Comparative Study of Deep Transfer Learning Techniques for Cultural (Aeta) Dance Classification utilizing Skeleton-Based Choreographic Motion Capture Data
Jennalyn N. Mindoro, E. Festijo, M. T. D. de Guzman
{"title":"A Comparative Study of Deep Transfer Learning Techniques for Cultural (Aeta) Dance Classification utilizing Skeleton-Based Choreographic Motion Capture Data","authors":"Jennalyn N. Mindoro, E. Festijo, M. T. D. de Guzman","doi":"10.1109/ICCIKE51210.2021.9410796","DOIUrl":null,"url":null,"abstract":"The advancement of motion-sensing technology and depth cameras has led to a vast opportunity in motion analysis and monitoring applications, kinesiology analysis, and safeguarding intangible cultural heritage (ICH). A technique that allows a computer to understand human behavior is necessary to analyze and identify the motion using the motion capture data. The integration of motion sensing technology such as markerless motion capture devices, inertial sensors, and deep learning techniques gives an innovative approach to recording, analyzing, and visually recognizing human choreographic motion. Convolutional Neural Network (CNN) is one of the best-known techniques in learning patterns from images and videos and is most widely used among deep learning architectures for vision applications. This study explored different CNN architecture to determine the best prediction classifier based on its performances, such as VGG19, InceptionV3, and MobileNetV2. This study aims to perform an image classification approach of one of the Philippines’ cultural dances, Aeta dances, utilizing skeleton-based motion capture data using CNN. The test results were assessed based on the generated training accuracy and evaluation of the loss function to assess the models’ overall efficiency. VGG19 produced the highest model cultural dance classification accuracy among the three architectures, which resulted in 98.68% compared to InceptionV3 and MobileNetV2. 
Thus, the VGG19 model illustrates the optimal transfer learning result implies the best fit model than InceptionV3 and MobileNetV2.","PeriodicalId":254711,"journal":{"name":"2021 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCIKE51210.2021.9410796","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
The advancement of motion-sensing technology and depth cameras has opened vast opportunities in motion analysis and monitoring applications, kinesiology analysis, and the safeguarding of intangible cultural heritage (ICH). A technique that allows a computer to understand human behavior is necessary to analyze and identify motion from motion capture data. The integration of motion-sensing technology, such as markerless motion capture devices and inertial sensors, with deep learning techniques offers an innovative approach to recording, analyzing, and visually recognizing human choreographic motion. The Convolutional Neural Network (CNN) is one of the best-known techniques for learning patterns from images and videos and is the most widely used deep learning architecture for vision applications. This study explored different CNN architectures, namely VGG19, InceptionV3, and MobileNetV2, to determine the best prediction classifier based on their performance. It aims to perform image classification of one of the Philippines' cultural dances, the Aeta dances, using CNNs on skeleton-based motion capture data. The test results were assessed based on the training accuracy and the loss function to evaluate the models' overall efficiency. Among the three architectures, VGG19 produced the highest cultural dance classification accuracy at 98.68%, outperforming InceptionV3 and MobileNetV2. Thus, the VGG19 model illustrates the optimal transfer learning result, implying a better-fitting model than InceptionV3 and MobileNetV2.
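The paper's actual pipeline (pretrained VGG19/InceptionV3/MobileNetV2 backbones fine-tuned on skeleton images) is not reproduced here. As a minimal, self-contained sketch of the transfer-learning idea the abstract describes, the toy example below freezes a stand-in "pretrained" feature extractor (a fixed random projection with ReLU, in place of a real CNN backbone) and trains only a new classification head on synthetic two-class data; all names and data in it are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a frozen pretrained backbone (e.g. VGG19 without its top):
# a fixed random projection followed by ReLU. Its weights are never updated.
W_backbone = rng.normal(size=(16, 8))

def extract_features(x):
    """Frozen feature extractor: only the new head below is trained."""
    return np.maximum(x @ W_backbone, 0.0)

# Synthetic two-class dataset standing in for two dance classes.
x0 = rng.normal(loc=-1.0, size=(50, 16))
x1 = rng.normal(loc=+1.0, size=(50, 16))
X = np.vstack([x0, x1])
y = np.array([0] * 50 + [1] * 50)

# New trainable head: logistic regression on the frozen features.
F = extract_features(X)
w = np.zeros(F.shape[1])
b = 0.0
lr = 0.1
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))  # sigmoid probabilities
    grad = p - y                            # gradient of cross-entropy w.r.t. logits
    w -= lr * (F.T @ grad) / len(y)         # update the head weights only
    b -= lr * grad.mean()

pred = (1.0 / (1.0 + np.exp(-(F @ w + b))) > 0.5).astype(int)
accuracy = (pred == y).mean()
print(f"head-only training accuracy: {accuracy:.2f}")
```

In the study itself, the frozen part would be a convolutional backbone pretrained on ImageNet and the head a dense softmax layer over the dance classes; the comparison across VGG19, InceptionV3, and MobileNetV2 amounts to swapping the backbone while keeping this head-training recipe fixed.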