A Comparative Study of Deep Transfer Learning Techniques for Cultural (Aeta) Dance Classification utilizing Skeleton-Based Choreographic Motion Capture Data
Jennalyn N. Mindoro, E. Festijo, M. T. D. de Guzman
{"title":"A Comparative Study of Deep Transfer Learning Techniques for Cultural (Aeta) Dance Classification utilizing Skeleton-Based Choreographic Motion Capture Data","authors":"Jennalyn N. Mindoro, E. Festijo, M. T. D. de Guzman","doi":"10.1109/ICCIKE51210.2021.9410796","DOIUrl":null,"url":null,"abstract":"The advancement of motion-sensing technology and depth cameras has led to a vast opportunity in motion analysis and monitoring applications, kinesiology analysis, and safeguarding intangible cultural heritage (ICH). A technique that allows a computer to understand human behavior is necessary to analyze and identify the motion using the motion capture data. The integration of motion sensing technology such as markerless motion capture devices, inertial sensors, and deep learning techniques gives an innovative approach to recording, analyzing, and visually recognizing human choreographic motion. Convolutional Neural Network (CNN) is one of the best-known techniques in learning patterns from images and videos and is most widely used among deep learning architectures for vision applications. This study explored different CNN architecture to determine the best prediction classifier based on its performances, such as VGG19, InceptionV3, and MobileNetV2. This study aims to perform an image classification approach of one of the Philippines’ cultural dances, Aeta dances, utilizing skeleton-based motion capture data using CNN. The test results were assessed based on the generated training accuracy and evaluation of the loss function to assess the models’ overall efficiency. VGG19 produced the highest model cultural dance classification accuracy among the three architectures, which resulted in 98.68% compared to InceptionV3 and MobileNetV2. 
Thus, the VGG19 model illustrates the optimal transfer learning result implies the best fit model than InceptionV3 and MobileNetV2.","PeriodicalId":254711,"journal":{"name":"2021 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCIKE51210.2021.9410796","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
The advancement of motion-sensing technology and depth cameras has opened vast opportunities in motion analysis and monitoring applications, kinesiology analysis, and the safeguarding of intangible cultural heritage (ICH). A technique that allows a computer to understand human behavior is necessary to analyze and identify motion from motion capture data. The integration of motion-sensing technology, such as markerless motion capture devices and inertial sensors, with deep learning techniques offers an innovative approach to recording, analyzing, and visually recognizing human choreographic motion. The Convolutional Neural Network (CNN) is one of the best-known techniques for learning patterns from images and videos and is the most widely used deep learning architecture for vision applications. This study explored different CNN architectures, namely VGG19, InceptionV3, and MobileNetV2, to determine the best prediction classifier based on their performance. It aims to perform image classification of one of the Philippines' cultural dances, the Aeta dances, using CNNs on skeleton-based motion capture data. The test results were assessed based on the training accuracy and the loss function to evaluate the models' overall efficiency. Among the three architectures, VGG19 produced the highest cultural dance classification accuracy at 98.68%, outperforming InceptionV3 and MobileNetV2. Thus, the VGG19 model illustrates the optimal transfer learning result, implying a better-fitting model than InceptionV3 and MobileNetV2.
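The paper's actual pipeline (pretrained VGG19/InceptionV3/MobileNetV2 backbones fine-tuned on skeleton images) is not reproduced here. As a minimal, self-contained sketch of the transfer-learning idea the abstract describes, the toy example below freezes a stand-in "pretrained" feature extractor (a fixed random projection with ReLU, in place of a real CNN backbone) and trains only a new classification head on synthetic two-class data; all names and data in it are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a frozen pretrained backbone (e.g. VGG19 without its top):
# a fixed random projection followed by ReLU. Its weights are never updated.
W_backbone = rng.normal(size=(16, 8))

def extract_features(x):
    """Frozen feature extractor: only the new head below is trained."""
    return np.maximum(x @ W_backbone, 0.0)

# Synthetic two-class dataset standing in for two dance classes.
x0 = rng.normal(loc=-1.0, size=(50, 16))
x1 = rng.normal(loc=+1.0, size=(50, 16))
X = np.vstack([x0, x1])
y = np.array([0] * 50 + [1] * 50)

# New trainable head: logistic regression on the frozen features.
F = extract_features(X)
w = np.zeros(F.shape[1])
b = 0.0
lr = 0.1
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))  # sigmoid probabilities
    grad = p - y                            # gradient of cross-entropy w.r.t. logits
    w -= lr * (F.T @ grad) / len(y)         # update the head weights only
    b -= lr * grad.mean()

pred = (1.0 / (1.0 + np.exp(-(F @ w + b))) > 0.5).astype(int)
accuracy = (pred == y).mean()
print(f"head-only training accuracy: {accuracy:.2f}")
```

In the study itself, the frozen part would be a convolutional backbone pretrained on ImageNet and the head a dense softmax layer over the dance classes; the comparison across VGG19, InceptionV3, and MobileNetV2 amounts to swapping the backbone while keeping this head-training recipe fixed.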