{"title":"Intelligent Multimodal Analysis Framework for Teacher-Student Interaction","authors":"Mengke Wang, Liang Luo, Zengzhao Chen, Qiuyu Zheng, Jiawen Li, Wei Gao","doi":"10.1109/IEIR56323.2022.10050044","DOIUrl":null,"url":null,"abstract":"This paper constructed a multi-modal analysis framework of teacher-student interaction based on intelligent technology. Voiceprint recognition was used to divide the teaching video into slices according to sentences and then used speech recognition, speech emotion analysis, gaze point estimation, and other technologies to recognize and encoded the multimodal behavior of each slice. We analyzed 10 lessons using the event sampling method proposed in the analysis framework in comparison with the classical temporal sampling analysis method and demonstrated the results of multimodal interaction analysis of an instructional video as an example. The results indicated that the event sampling method proposed not only reduces the number of analysis units but also has more complete information about the utterance of each unit, overcoming the incomplete information or information redundancy of analysis units caused by the mechanical segmentation of temporal sampling. The multimodal analysis showed that taking into account both teacher-student verbal and nonverbal interactions can reveal richer and deeper information about classroom teaching and learning. 
This framework provides an important reference for intelligent multimodal analysis of teacher-student interaction.","PeriodicalId":183709,"journal":{"name":"2022 International Conference on Intelligent Education and Intelligent Research (IEIR)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Intelligent Education and Intelligent Research (IEIR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IEIR56323.2022.10050044","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
This paper constructs a multimodal analysis framework for teacher-student interaction based on intelligent technologies. Voiceprint recognition is used to segment the teaching video into sentence-level slices; speech recognition, speech emotion analysis, gaze point estimation, and other technologies are then applied to recognize and encode the multimodal behavior of each slice. We analyzed 10 lessons with the event sampling method proposed in the framework, compared it against the classical temporal sampling method, and demonstrated the results of multimodal interaction analysis on an instructional video as an example. The results indicate that the proposed event sampling method not only reduces the number of analysis units but also preserves more complete utterance information in each unit, overcoming the incomplete or redundant information that the mechanical segmentation of temporal sampling produces. The multimodal analysis shows that considering both verbal and nonverbal teacher-student interactions reveals richer and deeper information about classroom teaching and learning. This framework provides an important reference for intelligent multimodal analysis of teacher-student interaction.
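The contrast the abstract draws between event sampling and temporal sampling can be illustrated with a minimal sketch. This is not the paper's implementation: the utterance timestamps, function names, and the 3-second window are all hypothetical choices made for illustration (fixed short intervals are typical of classical Flanders-style temporal coding).

```python
# Illustrative sketch: sentence-boundary "event sampling" versus
# fixed-interval "temporal sampling" over hypothetical utterance
# timestamps, given as (start, end) pairs in seconds.

utterances = [(0.0, 4.2), (4.5, 6.1), (6.4, 12.8), (13.0, 14.5)]

def event_units(utts):
    """One analysis unit per utterance: no utterance is ever split."""
    return list(utts)

def temporal_units(total_end, interval=3.0):
    """Fixed windows of `interval` seconds, regardless of speech."""
    units, t = [], 0.0
    while t < total_end:
        units.append((t, min(t + interval, total_end)))
        t += interval
    return units

ev = event_units(utterances)
tp = temporal_units(utterances[-1][1])

# Event sampling yields one unit per utterance (4 here), each
# containing a complete sentence.  Temporal sampling yields 5
# windows, several of which cut an utterance in two (incomplete
# information) or span parts of two utterances (redundancy).
print(len(ev), len(tp))  # → 4 5
```

The point of the sketch is only the unit boundaries: event sampling aligns analysis units with utterances, while mechanical windows do not.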