A Prompt-Aware Neural Network Approach to Content-Based Scoring of Non-Native Spontaneous Speech

2018 IEEE Spoken Language Technology Workshop (SLT). Pub Date: 2018-12-01. DOI: 10.1109/SLT.2018.8639697

Yao Qian, Rutuja Ubale, Matthew David Mulholland, Keelan Evanini, Xinhao Wang

Citations: 12

Abstract

We present a neural network approach to the automated assessment of non-native spontaneous speech in a listen-and-speak task. An attention-based Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) learns the relations (scoring rubrics) between spoken responses and their assigned scores. Each prompt (listening material) is encoded as a vector in a low-dimensional space and then used to condition the inputs of the attention LSTM-RNN. Experimental results show that, without any feature engineering, our approach performs as well as a strong Support Vector Regressor (SVR) baseline that uses content-related features, i.e., it reaches a correlation of r = 0.806 with holistic proficiency scores provided by humans. The prompt-encoded vector improves the discrimination between high-scoring and low-scoring samples, and it is more effective in grading responses to unseen prompts, which have no corresponding responses in the training set.
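The ideas named in the abstract can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation: the LSTM itself is omitted, the conditioned input vectors stand in for hidden states, the attention query is a hypothetical fixed vector rather than a learned one, and all numbers are made-up toy values. The sketch also assumes that r denotes the Pearson correlation coefficient and that "conditioning" means appending the prompt vector to every input vector; the abstract states neither explicitly.

```python
import math


def condition_on_prompt(word_vectors, prompt_vector):
    """Append the prompt vector to every word vector of a response (assumed form of conditioning)."""
    return [list(w) + list(prompt_vector) for w in word_vectors]


def attention_pool(states, query):
    """Return the softmax-weighted average of the state vectors and the weights.

    Each score is the dot product of a state with `query` (a hypothetical
    stand-in for a learned attention parameter).
    """
    scores = [sum(s_i * q_i for s_i, q_i in zip(s, query)) for s in states]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(states[0])
    pooled = [sum(w * s[d] for w, s in zip(weights, states)) for d in range(dim)]
    return pooled, weights


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy response of three 2-dimensional word vectors and a 3-dimensional prompt vector.
words = [[0.1, 0.2], [0.4, 0.0], [0.3, 0.5]]
prompt = [1.0, 0.0, -1.0]
conditioned = condition_on_prompt(words, prompt)  # three 5-dimensional vectors

# Pool the conditioned vectors (standing in for LSTM hidden states).
query = [0.5, 0.5, 0.5, 0.5, 0.5]
pooled, weights = attention_pool(conditioned, query)

# Made-up machine and human scores, only to show how r would be computed.
machine_scores = [2.1, 3.0, 3.8, 1.5]
human_scores = [2.0, 3.0, 4.0, 1.0]
r = pearson_r(machine_scores, human_scores)
```

In a full model the pooled vector would feed a regression layer that outputs the proficiency score, and r would be computed over a held-out set of human-scored responses.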
