Multilingual BLSTM and speaker-specific vector adaptation in 2016 BUT Babel system

2016 IEEE Spoken Language Technology Workshop (SLT) Pub Date: 2016-12-01 DOI: 10.1109/SLT.2016.7846330

M. Karafiát, M. Baskar, P. Matejka, Karel Veselý, F. Grézl, J. Černocký

Citations: 23

Abstract

This paper provides an extensive summary of the BUT 2016 system for the last IARPA Babel evaluations. It concentrates on multilingual training of both deep neural network (DNN)-based feature extraction and acoustic models, including multilingual training of bidirectional Long Short-Term Memory (BLSTM) networks. Next, two low-dimensional vector approaches to speaker adaptation are investigated: i-vectors and sequence-summarizing neural networks (SSNN). The results on three Babel Year 4 languages show a clear advantage for both approaches when only a limited amount of training data is available. The time necessary to develop a new system is also addressed, as some of the investigated techniques do not require extensive re-training of the whole system.



