QCRI advanced transcription system (QATS) for the Arabic Multi-Dialect Broadcast media recognition: MGB-2 challenge

2016 IEEE Spoken Language Technology Workshop (SLT). Pub Date: 2016-12-01. DOI: 10.1109/SLT.2016.7846279

Sameer Khurana, Ahmed M. Ali

Citations: 48

Abstract

In this paper, we describe Qatar Computing Research Institute's (QCRI) speech transcription system for the 2016 Dialectal Arabic Multi-Genre Broadcast (MGB-2) challenge. MGB-2 is a controlled evaluation using 1,200 hours of audio with lightly supervised transcription. Our system, a combination of three purely sequence-trained recognition systems, achieved the lowest word error rate (WER) of 14.2% among the nine participating teams. Key features of our transcription system are: purely sequence-trained acoustic models using the recently introduced lattice-free Maximum Mutual Information (LF-MMI) modeling framework; language model rescoring using four-gram and Recurrent Neural Network with MaxEnt connections (RNNME) language models; and system combination using the Minimum Bayes Risk (MBR) decoding criterion. The whole system is built using the Kaldi speech recognition toolkit.
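The abstract reports results in WER and names MBR as the combination criterion. The following is a minimal Python sketch of both ideas, not the authors' implementation: the paper's system is built with Kaldi, and how it applies MBR is not detailed here, whereas this toy version works on a short N-best list of strings. The function names (`edit_distance`, `wer`, `mbr_select`) and the example hypotheses and posteriors are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Word error rate: edit distance divided by the reference length."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def mbr_select(nbest: List[Tuple[str, float]]) -> str:
    """Return the hypothesis with the lowest expected word errors.

    `nbest` holds (hypothesis, posterior) pairs, assumed here to come from
    one or more recognizers; posteriors are renormalized to sum to 1.
    """
    total = sum(p for _, p in nbest)

    def risk(candidate: str) -> float:
        cand_words = candidate.split()
        return sum((p / total) * edit_distance(other.split(), cand_words)
                   for other, p in nbest)

    return min((hyp for hyp, _ in nbest), key=risk)


if __name__ == "__main__":
    print(wer("a b c d", "a x c"))
    hypotheses = [("the cat sat", 0.4),
                  ("the cat sat down", 0.3),
                  ("a cat sat down", 0.3)]
    print(mbr_select(hypotheses))
```

In the toy N-best list, the most probable single hypothesis is "the cat sat", but the MBR choice is "the cat sat down" because it is closer on average to the other hypotheses. That difference between maximum-posterior and minimum-expected-error selection is the reason MBR is attractive for combining the outputs of several systems.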



