Language modeling with neural trans-dimensional random fields

Bin Wang, Zhijian Ou
{"title":"Language modeling with neural trans-dimensional random fields","authors":"Bin Wang, Zhijian Ou","doi":"10.1109/ASRU.2017.8268949","DOIUrl":null,"url":null,"abstract":"Trans-dimensional random field language models (TRF LMs) have recently been introduced, where sentences are modeled as a collection of random fields. The TRF approach has been shown to have the advantages of being computationally more efficient in inference than LSTM LMs with close performance and being able to flexibly integrate rich features. In this paper we propose neural TRFs, beyond of the previous discrete TRFs that only use linear potentials with discrete features. The idea is to use nonlinear potentials with continuous features, implemented by neural networks (NNs), in the TRF framework. Neural TRFs combine the advantages of both NNs and TRFs. The benefits of word embedding, nonlinear feature learning and larger context modeling are inherited from the use of NNs. At the same time, the strength of efficient inference by avoiding expensive softmax is preserved. A number of technical contributions, including employing deep convolutional neural networks (CNNs) to define the potentials and incorporating the joint stochastic approximation (JSA) strategy in the training algorithm, are developed in this work, which enable us to successfully train neural TRF LMs. Various LMs are evaluated in terms of speech recognition WERs by rescoring the 1000-best lists of WSJ'92 test data. The results show that neural TRF LMs not only improve over discrete TRF LMs, but also perform slightly better than LSTM LMs with only one fifth of parameters and 16x faster inference efficiency.","PeriodicalId":290868,"journal":{"name":"2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2017.8268949","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

Trans-dimensional random field language models (TRF LMs) have recently been introduced, in which sentences are modeled as a collection of random fields. The TRF approach has been shown to be computationally more efficient in inference than LSTM LMs while achieving close performance, and to flexibly integrate rich features. In this paper we propose neural TRFs, going beyond the previous discrete TRFs that use only linear potentials with discrete features. The idea is to use nonlinear potentials with continuous features, implemented by neural networks (NNs), in the TRF framework. Neural TRFs combine the advantages of both NNs and TRFs: the benefits of word embedding, nonlinear feature learning and larger context modeling are inherited from the use of NNs, while the strength of efficient inference, achieved by avoiding the expensive softmax, is preserved. A number of technical contributions are developed in this work, including employing deep convolutional neural networks (CNNs) to define the potentials and incorporating the joint stochastic approximation (JSA) strategy into the training algorithm; these enable us to successfully train neural TRF LMs. Various LMs are evaluated in terms of speech recognition WERs by rescoring the 1000-best lists of the WSJ'92 test data. The results show that neural TRF LMs not only improve over discrete TRF LMs, but also perform slightly better than LSTM LMs with only one fifth of the parameters and 16x faster inference.
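To make the model shape concrete: in the TRF formulation used in this line of work, a sentence x^l of length l is assigned probability roughly of the form p(l, x^l; θ) = π_l · exp(φ(x^l; θ)) / Z_l(θ), where π_l is a prior over lengths, Z_l(θ) is the per-length normalizing constant, and φ is the potential function, which in neural TRFs is a deep CNN over word embeddings. The sketch below illustrates such a CNN potential; it is not the authors' code, and the class name, layer sizes, and single-convolution architecture are illustrative assumptions (the paper's actual network is deeper). What it does show is the key inference property: scoring a hypothesis is one forward pass with no per-position softmax over the vocabulary.

```python
# Minimal sketch (not the authors' implementation): a CNN potential phi(x; theta)
# that maps a word-id sequence to a single unnormalized log-score.
import torch
import torch.nn as nn

class CNNPotential(nn.Module):  # hypothetical name and architecture
    def __init__(self, vocab_size=10000, embed_dim=128, channels=64, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, channels, kernel, padding=kernel // 2)
        self.out = nn.Linear(channels, 1)

    def forward(self, word_ids):                    # word_ids: (batch, length)
        h = self.embed(word_ids).transpose(1, 2)    # (batch, embed_dim, length)
        h = torch.relu(self.conv(h))                # (batch, channels, length)
        h = h.max(dim=2).values                     # pool over positions
        return self.out(h).squeeze(-1)              # one scalar potential per sentence

# Rescoring an n-best hypothesis needs only this forward pass; there is no
# softmax over the vocabulary at any position, which is where the inference
# speedup over LSTM LMs comes from. Training (estimating theta and the Z_l)
# is the hard part, handled in the paper by the JSA strategy.
phi = CNNPotential()
scores = phi(torch.randint(0, 10000, (4, 20)))     # unnormalized log-scores
```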