Complementary Systems for Off-Topic Spoken Response Detection

Workshop on Innovative Use of NLP for Building Educational Applications. Pub Date: 2020-07-01. DOI: 10.18653/v1/2020.bea-1.4

V. Raina, M. Gales, K. Knill


Citations: 4

Abstract

Increased demand to learn English for business and education has led to growing interest in automatic spoken language assessment and teaching systems. With this shift to automated approaches, it is important that systems reliably assess all aspects of a candidate’s responses. This paper examines one aspect of spoken language assessment: whether the response from the candidate is relevant to the prompt provided. This is referred to as off-topic spoken response detection. Two previously proposed approaches are examined in this work: the hierarchical attention-based topic model (HATM) and the similarity grid model (SGM). The work focuses on the scenario in which the prompt, and its associated responses, have not been seen in the training data, so that the system can be applied to new test scripts without the need to collect data or retrain the model. To improve the performance of the systems on unseen prompts, data augmentation based on easy data augmentation (EDA) and translation-based approaches is applied. Additionally, for the HATM, a form of prompt dropout is described. The systems were evaluated on both seen and unseen prompts from Linguaskill Business and General English tests. For unseen data, the performance of the HATM was improved using data augmentation, in contrast to the SGM, where no gains were obtained. The two approaches were found to be complementary to one another, yielding a combined F0.5 score of 0.814 for off-topic response detection where the prompts have not been seen in training.
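The abstract reports results as an F0.5 score. As a minimal sketch of how that metric is computed, the standard F-beta definition with beta = 0.5 weights precision more heavily than recall. The function names and the example counts below are illustrative assumptions and are not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is detected correctly
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def f_beta_from_counts(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from detection counts, treating 'off-topic' as the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return f_beta(precision, recall, beta)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    print(round(f_beta_from_counts(tp=8, fp=2, fn=8), 3))
```

With these hypothetical counts, precision is 0.8 and recall is 0.5; because beta is below 1, the resulting score sits closer to the precision than a balanced F1 would.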
