{"title":"Training Language Models for Long-Span Cross-Sentence Evaluation","authors":"Kazuki Irie, Albert Zeyer, R. Schlüter, H. Ney","doi":"10.1109/ASRU46091.2019.9003788","DOIUrl":null,"url":null,"abstract":"While recurrent neural networks can motivate cross-sentence language modeling and its application to automatic speech recognition (ASR), corresponding modifications of the training method for that end are rarely discussed. In fact, even more generally, the impact of training sequence construction strategy in language modeling for different evaluation conditions is typically ignored. In this work, we revisit this basic but fundamental question. We train language models based on long short-term memory recurrent neural networks and Transformers using various types of training sequences and study their robustness with respect to different evaluation modes. Our experiments on 300h Switchboard and Quaero English datasets show that models trained with back-propagation over sequences consisting of concatenation of multiple sentences with state carry-over across sequences effectively outperform those trained with the sentence-level training, both in terms of perplexity and word error rates for cross-utterance ASR.","PeriodicalId":150913,"journal":{"name":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"38","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU46091.2019.9003788","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 38
Abstract
While recurrent neural networks naturally motivate cross-sentence language modeling and its application to automatic speech recognition (ASR), the corresponding modifications of the training method to that end are rarely discussed. More generally, the impact of the training-sequence construction strategy on language modeling under different evaluation conditions is typically ignored. In this work, we revisit this basic but fundamental question. We train language models based on long short-term memory recurrent neural networks and Transformers using various types of training sequences, and we study their robustness with respect to different evaluation modes. Our experiments on the 300-hour Switchboard and Quaero English datasets show that models trained with back-propagation over sequences formed by concatenating multiple sentences, with state carry-over across sequences, effectively outperform models trained at the sentence level, both in terms of perplexity and in terms of word error rate for cross-utterance ASR.
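To make the contrast between the two training-sequence strategies concrete, the following is a minimal sketch, not the authors' implementation: a toy PyTorch LSTM language model trained either on concatenated multi-sentence sequences with hidden-state carry-over across BPTT chunks, or on isolated sentences with the state reset at every sentence boundary. All names (ToyLSTMLM, train_concat_with_carryover, train_sentence_level), hyperparameters, and the choice of framework are illustrative assumptions made for this example only.

```python
import torch
import torch.nn as nn


class ToyLSTMLM(nn.Module):
    """Word-level LSTM language model (illustrative only)."""

    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)
        y, state = self.lstm(x, state)
        return self.out(y), state


def detach_state(state):
    # Keep the state *values* across chunks but cut the gradient graph,
    # so back-propagation stays within one BPTT chunk.
    return tuple(s.detach() for s in state)


def train_concat_with_carryover(model, optimizer, token_stream, vocab_size,
                                bptt_len=64):
    """Cross-sentence training: sentences are concatenated into one long token
    stream, cut into fixed-length chunks, and the LSTM state is carried over
    from one chunk to the next."""
    criterion = nn.CrossEntropyLoss()
    state = None
    for start in range(0, len(token_stream) - bptt_len - 1, bptt_len):
        chunk = torch.tensor(token_stream[start:start + bptt_len + 1]).unsqueeze(0)
        inputs, targets = chunk[:, :-1], chunk[:, 1:]
        logits, state = model(inputs, state)
        state = detach_state(state)  # state carry-over across sequences
        loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


def train_sentence_level(model, optimizer, sentences, vocab_size):
    """Baseline: each sentence is its own training sequence and the LSTM state
    is reset at every sentence boundary, so no cross-sentence context is seen."""
    criterion = nn.CrossEntropyLoss()
    for sent in sentences:
        chunk = torch.tensor(sent).unsqueeze(0)
        inputs, targets = chunk[:, :-1], chunk[:, 1:]
        logits, _ = model(inputs, state=None)  # fresh state per sentence
        loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Under this sketch, the first routine matches the paper's "concatenation of multiple sentences with state carry-over" setting, and the second matches the sentence-level baseline; the only structural difference is whether the recurrent state survives sentence and chunk boundaries during training.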