MIT-QCRI Arabic dialect identification system for the 2017 multi-genre broadcast challenge

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Pub Date: 2017-08-28. DOI: 10.1109/ASRU.2017.8268960

Suwon Shon, Ahmed M. Ali, James R. Glass

Citations: 22

Abstract

To successfully annotate the Arabic speech content found in open-domain media broadcasts, it is essential to be able to process a diverse set of Arabic dialects. The 2017 Multi-Genre Broadcast challenge (MGB-3) offered two tasks: Arabic speech recognition and Arabic Dialect Identification (ADI). In this paper, we describe our efforts to create an ADI system for the MGB-3 challenge, with the goal of distinguishing among four major Arabic dialects as well as Modern Standard Arabic. Our research focused on dialect variability and on domain mismatches between the training and test domains. To achieve a robust ADI system, we explored both Siamese neural network models, which learn similarities and dissimilarities among Arabic dialects, and i-vector post-processing to adapt to domain mismatches. Both acoustic and linguistic features were used for the final MGB-3 submissions, with the best primary system achieving 75% accuracy on the official 10-hour test set.
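The abstract does not specify the Siamese architecture or its training objective, so the following is only a minimal illustrative sketch of the general idea of learning from pairs: two utterance embeddings are compared by cosine similarity, and a pairwise loss rewards high similarity for same-dialect pairs and penalizes similarity above a margin for different-dialect pairs. The function names, the `margin` parameter, and the toy vectors are all assumptions made for illustration, not details taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pair_loss(a, b, same_dialect, margin=0.0):
    """Hypothetical pairwise loss (an assumption, not the paper's objective).

    Same-dialect pairs are pulled toward similarity 1; different-dialect
    pairs are penalized only when their similarity exceeds `margin`.
    """
    sim = cosine_similarity(a, b)
    if same_dialect:
        return 1.0 - sim
    return max(0.0, sim - margin)


# Toy 2-D stand-ins for utterance embeddings (made-up values).
dialect_a_utt1 = [1.0, 0.0]
dialect_a_utt2 = [1.0, 0.0]
dialect_b_utt1 = [0.0, 1.0]

loss_same = pair_loss(dialect_a_utt1, dialect_a_utt2, same_dialect=True)
loss_diff = pair_loss(dialect_a_utt1, dialect_b_utt1, same_dialect=False)
```

In a real system the embeddings would come from a shared network applied to both inputs of the pair, and the loss would be minimized over many labeled pairs; here the vectors are fixed so that only the pairwise comparison is shown.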


