The language-independent bottleneck features

2012 IEEE Spoken Language Technology Workshop (SLT) | Pub Date: 2012-12-01 | DOI: 10.1109/SLT.2012.6424246

Karel Veselý, M. Karafiát, F. Grézl, M. Janda, E. Egorova

Citations: 216

Abstract

In this paper we present a novel language-independent bottleneck (BN) feature extraction framework. In our experiments we used a Multilingual Artificial Neural Network (ANN), in which each language is modelled by a separate output layer, while all the hidden layers jointly model the variability of all the source languages. The key idea is that the entire ANN is trained on all the languages simultaneously, so the BN-features are not biased towards any one language. For exactly this reason, the final BN-features are considered language-independent. In experiments with the GlobalPhone database, we show that Multilingual BN-features consistently outperform Monolingual BN-features. Cross-lingual generalization is also evaluated: we train on 5 source languages and test on 3 other languages. The results show that the ANN can produce very good BN-features even for unseen languages, in some cases better than those obtained by training the ANN on the target language only.
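The architecture described in the abstract (hidden layers shared across languages, one output layer per language, features read from a narrow bottleneck layer) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the layer sizes, the language names, the `tanh` nonlinearity, the linear bottleneck, and the helper names `init_layer`, `bottleneck_features` and `posteriors` are all assumptions made for illustration, and training is omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only; none come from the paper.
FEAT_DIM = 39
HIDDEN_DIM = 64
BN_DIM = 8
TARGETS_PER_LANGUAGE = {"lang_a": 30, "lang_b": 40, "lang_c": 35}


def init_layer(n_in, n_out):
    """Return a (weights, bias) pair with small random weights."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


# Hidden layers shared by every language; the second one is the narrow bottleneck.
shared = [
    init_layer(FEAT_DIM, HIDDEN_DIM),
    init_layer(HIDDEN_DIM, BN_DIM),
    init_layer(BN_DIM, HIDDEN_DIM),
]

# One separate output layer per source language.
heads = {lang: init_layer(HIDDEN_DIM, n) for lang, n in TARGETS_PER_LANGUAGE.items()}


def bottleneck_features(x):
    """BN-features: activations of the bottleneck layer, with no language-specific part."""
    (w1, b1), (w2, b2), _ = shared
    h1 = np.tanh(x @ w1 + b1)
    return h1 @ w2 + b2  # linear bottleneck output (an assumption)


def posteriors(x, lang):
    """Per-language class posteriors: shared layers followed by that language's softmax."""
    w3, b3 = shared[2]
    h3 = np.tanh(bottleneck_features(x) @ w3 + b3)
    w_out, b_out = heads[lang]
    logits = h3 @ w_out + b_out
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


frames = rng.normal(size=(5, FEAT_DIM))  # five random stand-in feature frames
bn = bottleneck_features(frames)
post = posteriors(frames, "lang_b")
```

In this sketch `bottleneck_features` never touches `heads`, so the same extractor can be applied to frames from a language that has no output layer of its own, which mirrors the cross-lingual setting the abstract describes. In a training loop, each frame would update only its own language's output layer together with the shared layers.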



