Words matter: automatic detection of teacher questions in live classroom discourse using linguistics, acoustics, and context

Proceedings of the Seventh International Learning Analytics & Knowledge Conference. Pub Date: 2017-03-13. DOI: 10.1145/3027385.3027417

P. Donnelly, Nathaniel Blanchard, A. Olney, Sean Kelly, M. Nystrand, S. D’Mello


Citations: 42

Abstract

We investigate automatic detection of teacher questions from audio recordings collected in live classrooms with the goal of providing automated feedback to teachers. Using a dataset of audio recordings from 11 teachers across 37 class sessions, we automatically segment the audio into individual teacher utterances and code each as containing a question or not. We train supervised machine learning models to detect the human-coded questions using high-level linguistic features extracted from automatic speech recognition (ASR) transcripts, acoustic and prosodic features from the audio recordings, as well as context features, such as timing and turn-taking dynamics. Models are trained and validated independently of the teacher to ensure generalization to new teachers. We are able to distinguish questions and non-questions with a weighted F1 score of 0.69. A comparison of the three feature sets indicates that a model using linguistic features outperforms those using acoustic-prosodic and context features for question detection, but the combination of features yields a 5% improvement in overall accuracy compared to linguistic features alone. We discuss applications for pedagogical research, teacher formative assessment, and teacher professional development.
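The two evaluation ideas named in the abstract, teacher-independent validation and a weighted F1 score, can be illustrated with a minimal sketch. It makes two assumptions the abstract does not state: that teacher independence is implemented as leave-one-teacher-out folds, and that "weighted F1" means the support-weighted mean of per-class F1. The function names and the toy labels are hypothetical and are not the authors' code or data.

```python
from collections import defaultdict


def leave_one_teacher_out(teacher_ids):
    """Yield (held_out, train_idx, test_idx); no teacher is in both sets.

    Assumption: one fold per teacher. The abstract only says models are
    trained and validated independently of the teacher.
    """
    by_teacher = defaultdict(list)
    for i, teacher in enumerate(teacher_ids):
        by_teacher[teacher].append(i)
    for held_out in sorted(by_teacher):
        test_idx = by_teacher[held_out]
        train_idx = sorted(
            i for t, idxs in by_teacher.items() if t != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def f1_for_class(y_true, y_pred, label):
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    n = len(y_true)
    return sum(
        f1_for_class(y_true, y_pred, label) * y_true.count(label) / n
        for label in sorted(set(y_true))
    )


# Toy data (hypothetical): 1 = utterance contains a question, 0 = it does not.
teachers = ["A", "A", "B", "C", "C", "C"]
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]

folds = list(leave_one_teacher_out(teachers))
score = weighted_f1(y_true, y_pred)
```

In this toy case the question class has F1 = 0.5 (support 2) and the non-question class has F1 = 0.75 (support 4), so the weighted score is about 0.667. Splitting by teacher rather than by utterance keeps a held-out teacher's speech out of training, which is what lets a score speak to generalization to new teachers.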

