{"title":"Comparison of CNN-Based Methods for Yoga Pose Classification","authors":"Vildan ATALAY AYDIN","doi":"10.31127/tuje.1275826","DOIUrl":null,"url":null,"abstract":"Yoga is an exercise developed in ancient India. People perform yoga in order to have mental, physical, and spiritual benefits. While yoga helps build strength in the mind and body, incorrect postures might result in serious injuries. Therefore, yoga exercisers need either an expert or a platform to receive feedback on their performance. Since access to experts is not an option for everyone, a system to provide feedback on the yoga poses is required. To this end, commercial products such as smart yoga mats and smart pants are produced; Kinect cameras, sensors, and wearable devices are used. However, these solutions are either uncomfortable to wear or not affordable for everyone. Nonetheless, a system that employs computer vision techniques is a requirement. In this paper, we propose a deep-learning model for yoga pose classification, which is the first step of a quality assessment system. We introduce a wavelet-based model that first takes wavelet transform of input images. The acquired subbands, i.e., approximation, horizontal, vertical, and diagonal coefficients of the wavelet transform are then fed into separate convolutional neural networks (CNN). The obtained probability results for each group are fused in order to have the final yoga class prediction. A publicly available dataset with 5 yoga poses is used. Since the number of images in the dataset is not enough for a deep learning model, we also perform data augmentation to increase the number of images. We compare our results to a CNN model and the three models that employ the subbands separately. Results obtained using the proposed model outperforms the accuracy output achieved with the compared models.","PeriodicalId":23377,"journal":{"name":"Turkish Journal of Engineering and Environmental Sciences","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Turkish Journal of Engineering and Environmental Sciences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.31127/tuje.1275826","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Yoga is an exercise developed in ancient India. People perform yoga in order to have mental, physical, and spiritual benefits. While yoga helps build strength in the mind and body, incorrect postures might result in serious injuries. Therefore, yoga exercisers need either an expert or a platform to receive feedback on their performance. Since access to experts is not an option for everyone, a system to provide feedback on the yoga poses is required. To this end, commercial products such as smart yoga mats and smart pants are produced; Kinect cameras, sensors, and wearable devices are used. However, these solutions are either uncomfortable to wear or not affordable for everyone. Nonetheless, a system that employs computer vision techniques is a requirement. In this paper, we propose a deep-learning model for yoga pose classification, which is the first step of a quality assessment system. We introduce a wavelet-based model that first takes wavelet transform of input images. The acquired subbands, i.e., approximation, horizontal, vertical, and diagonal coefficients of the wavelet transform are then fed into separate convolutional neural networks (CNN). The obtained probability results for each group are fused in order to have the final yoga class prediction. A publicly available dataset with 5 yoga poses is used. Since the number of images in the dataset is not enough for a deep learning model, we also perform data augmentation to increase the number of images. We compare our results to a CNN model and the three models that employ the subbands separately. Results obtained using the proposed model outperforms the accuracy output achieved with the compared models.