Research on Classroom Interaction Behavior Analysis Algorithm based on Audio and Video

Zhiwei Zheng, Yuting Huang
Published in: Proceedings of the 5th International Conference on Control and Computer Vision, 2022-08-19
DOI: 10.1145/3561613.3561633 (https://doi.org/10.1145/3561613.3561633)

Abstract

Research on classroom interaction behavior is an important part of evaluating classroom teaching quality and can help improve it. Traditionally, such research is carried out through expert lectures and student questionnaires. This approach neither makes good use of the large amount of data generated in the classroom nor provides an objective, detailed evaluation of teaching quality. In the context of educational informatization, by contrast, using information technology to observe and analyze classroom interaction can make full use of teaching data and provide timely, objective feedback on teaching. This paper focuses on classroom interaction behavior in colleges and universities. To make full use of classroom audio and video data, it constructs a framework for classroom interaction behavior analysis based on audio and video. The framework divides classroom interaction behaviors into verbal and non-verbal categories and uses deep learning to automate the analysis. The main contributions are as follows: (1) Drawing on the theoretical basis of traditional classroom interaction analysis and on the requirements that efficient classrooms place on classroom quality evaluation, the paper constructs an audio-video-based classroom interaction behavior analysis framework. (2) For the verbal interaction analysis task, the speaker segmentation and clustering (speaker diarization) algorithm is improved: a frame-level feature extraction network integrating LSTM and TDNN is proposed, together with a temporal pooling network based on a dual multi-head attention mechanism. Compared with the DIHARD III baseline network, the improved algorithm reduces the diarization error rate (DER) by 3.24%, 3.19%, 4.53% and 4.14%, respectively, on the four types of evaluation datasets. (3) For the face detection algorithm in the non-verbal interaction analysis task, a single-stage face detection network, FDN, is proposed, with a bidirectional feature fusion module (FPN+PANet), an IoU-aware prediction branch, and the CIoU loss function. Compared with RetinaFace, the final FDN improves most clearly on hard targets: average precision (AP) on the hard subsets of the validation and test sets increases by 2.6% and 2.7%, respectively.
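For readers unfamiliar with the DER metric cited in contribution (2), the following is a minimal sketch of how a diarization error rate can be computed. It assumes a simplified frame-level setting that the abstract does not specify: one speaker label per frame, `None` for non-speech, no overlapping speech and no scoring collar. The function name `frame_der` and the label values are hypothetical, and the brute-force speaker mapping is only practical for a handful of speakers; this is not the paper's evaluation code.

```python
from itertools import permutations


def frame_der(ref, hyp):
    """Frame-level DER = (missed + false alarm + confusion) / reference speech.

    ref, hyp: equal-length sequences of speaker labels, None = non-speech.
    Simplifying assumption: at most one speaker per frame, no collar.
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same number of frames")
    ref_speech = sum(r is not None for r in ref)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    # Speech in the reference that the system labelled as non-speech.
    missed = sum(r is not None and h is None for r, h in zip(ref, hyp))
    # Non-speech in the reference that the system labelled as speech.
    false_alarm = sum(r is None and h is not None for r, h in zip(ref, hyp))

    # Frames where both sides have a speaker; used to score confusion.
    both = [(r, h) for r, h in zip(ref, hyp) if r is not None and h is not None]
    ref_ids = sorted({r for r in ref if r is not None})
    hyp_ids = sorted({h for h in hyp if h is not None})

    # Hypothesis labels are arbitrary, so search for the one-to-one mapping
    # onto reference speakers that maximises agreement. Padding with None
    # lets surplus hypothesis speakers stay unmatched.
    k = max(len(ref_ids), len(hyp_ids))
    padded_ref = ref_ids + [None] * (k - len(ref_ids))
    best_correct = 0
    for perm in permutations(padded_ref, len(hyp_ids)):
        mapping = dict(zip(hyp_ids, perm))
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / ref_speech


reference = ["A", "A", "A", "B", "B", None]
hypothesis = ["x", "x", "y", "y", "y", "y"]
print(frame_der(reference, hypothesis))
```

In this toy input, one frame of speaker A is attributed to the wrong cluster and one non-speech frame is marked as speech, so two of the five reference speech frames count as errors.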
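The CIoU loss named in contribution (3) can be illustrated with a short sketch. It follows the commonly published Complete-IoU formulation, 1 - IoU + ρ²/c² + αv, for a single pair of axis-aligned boxes in `(x1, y1, x2, y2)` form. The function name `ciou_loss`, the box layout and the `eps` stabiliser are assumptions made for illustration; the abstract does not describe how FDN implements the loss.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """Complete-IoU loss for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between box centres (rho^2).
    center_dist2 = ((px1 + px2) - (tx1 + tx2)) ** 2 / 4 \
        + ((py1 + py2) - (ty1 + ty2)) ** 2 / 4
    # Squared diagonal of the smallest box enclosing both (c^2).
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    diag2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + center_dist2 / diag2 + alpha * v


print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes: close to 0
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes: greater than 1
```

Unlike plain IoU, the centre-distance term still provides a useful signal when the boxes do not overlap, which is one reason this family of losses is popular for box regression.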