Evaluation of methods to combine different speech recognizers

2015 Federated Conference on Computer Science and Information Systems (FedCSIS) Pub Date: 2015-11-09 DOI: 10.15439/2015F62

Tomas Rasymas, V. Rudzionis

Citations: 9

Abstract

This paper addresses the problem of improving speech recognition by combining the outputs of several different recognizers. We present results from experiments with classification methods suitable for combining those outputs. The methods evaluated were k-nearest neighbors (KNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), logistic regression (LR), and maximum likelihood (ML). The highest accuracy (98.16%) was obtained with the k-nearest neighbors method using 15 nearest neighbors; in this case accuracy increased by 7.78% compared with the best single-recognizer result. The experiments combined one native (Lithuanian) recognizer with several foreign-language recognizers: Russian, English, and two German recognizers. To adapt the foreign-language recognizers, we used a text transcription method based on formal rules. The experiments showed that recognition accuracy improves when several speech recognizers are combined.
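The combination step described in the abstract can be pictured with a minimal sketch. It assumes each utterance is represented by one numeric score per recognizer (five here, matching the Lithuanian, Russian, English, and two German recognizers) and that the combiner picks the majority label among the k nearest training vectors. The feature layout, the function name knn_combine, the labels, and the toy data are all hypothetical; the abstract does not say which features or distance measure the authors used.

```python
import math
from collections import Counter


def knn_combine(train_features, train_labels, query, k=15):
    """Return the majority label among the k training vectors nearest to `query`.

    Each vector holds one score per recognizer (an assumed representation).
    Euclidean distance is an assumption as well.
    """
    if len(train_features) != len(train_labels):
        raise ValueError("features and labels must have the same length")
    if not train_features:
        raise ValueError("training set is empty")

    # Never ask for more neighbors than there are training examples.
    k = min(k, len(train_features))

    # Sort training examples by distance to the query vector.
    ranked = sorted(
        zip(train_features, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )

    # Majority vote over the k closest examples.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: made-up scores from five recognizers for two hypothetical words.
train_features = [
    (0.90, 0.20, 0.10, 0.30, 0.20),
    (0.80, 0.30, 0.20, 0.20, 0.10),
    (0.85, 0.25, 0.15, 0.25, 0.20),
    (0.20, 0.70, 0.80, 0.60, 0.70),
    (0.10, 0.80, 0.70, 0.70, 0.60),
    (0.30, 0.60, 0.90, 0.80, 0.80),
]
train_labels = ["word_a", "word_a", "word_a", "word_b", "word_b", "word_b"]

prediction = knn_combine(
    train_features, train_labels, (0.82, 0.28, 0.12, 0.22, 0.18), k=3
)
print(prediction)
```

The default k=15 mirrors the setting that the abstract reports as most accurate; the toy call uses k=3 only because the example set has six vectors.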


