SlowFast with DropBlock and smooth samples loss for student action recognition

International Conference on Digital Image Processing. Pub Date: 2022-10-12. DOI: 10.1117/12.2644370

Chuanming Li, Wenxing Bao, Xu Chen, Yongjun Jing, Xiudong Qu


Citations: 0

Abstract

SlowFast with DropBlock and smooth samples loss for student action recognition

With the advent of large-scale video datasets, action recognition using three-dimensional convolutional neural networks (3D CNNs), which capture spatiotemporal information, has become mainstream. To address classroom student behavior recognition, this paper adopts an improved SlowFast network whose two pathways handle spatial structure and temporal events, respectively. First, DropBlock, a regularization method, is added to the SlowFast network to mitigate overfitting. Second, to handle the long-tailed class distribution, a purpose-designed Smooth Sample (SS) loss function is added to the network to smooth the imbalance in sample counts across classes. Classification experiments show that, compared with similar methods, the proposed method improves accuracy on the Kinetics and Student Action datasets by 2.1% and 2.9%, respectively.
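
The abstract gives no implementation details, so the following is only a minimal sketch of the general DropBlock idea as it is commonly described: during training, contiguous square regions of a feature map are zeroed and the surviving activations are rescaled. It operates on a single 2D map in plain Python. The function name `dropblock_2d`, its parameters, and the choice to seed blocks by their top-left corner are illustrative assumptions, not the authors' code. Where DropBlock is inserted in SlowFast and the form of the SS loss are not stated here, so neither is shown.

```python
import random


def dropblock_2d(feature_map, drop_prob, block_size, rng=None):
    """Apply DropBlock to one 2D feature map (a list of rows), training mode only.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = rng or random.Random()
    height, width = len(feature_map), len(feature_map[0])
    if drop_prob <= 0.0:
        return [row[:] for row in feature_map]

    # Number of top-left positions at which a full block fits inside the map.
    valid_h = height - block_size + 1
    valid_w = width - block_size + 1
    if valid_h <= 0 or valid_w <= 0:
        raise ValueError("block_size must not exceed the feature map size")

    # Seed rate chosen so that roughly drop_prob of all units end up dropped.
    gamma = drop_prob * (height * width) / (block_size ** 2) / (valid_h * valid_w)

    # keep[i][j] == 0 marks a dropped unit.
    keep = [[1] * width for _ in range(height)]
    for top in range(valid_h):
        for left in range(valid_w):
            if rng.random() < gamma:
                # Zero the whole block_size x block_size square.
                for i in range(top, top + block_size):
                    for j in range(left, left + block_size):
                        keep[i][j] = 0

    kept = sum(map(sum, keep))
    if kept == 0:
        return [[0.0] * width for _ in range(height)]

    # Rescale survivors so the expected activation magnitude is preserved.
    scale = (height * width) / kept
    return [
        [value * k * scale for value, k in zip(row, keep_row)]
        for row, keep_row in zip(feature_map, keep)
    ]
```

Unlike ordinary dropout, which removes isolated units, dropping whole blocks removes spatially correlated activations together, which is why the technique is usually applied to convolutional feature maps.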
