Towards Slovak-English-Mandarin Speech Recognition Using Deep Learning
Matus Pleva, Y. Liao, Wu-Hua Hsu, D. Hládek, J. Staš, P. Viszlay, M. Lojka, J. Juhár
2018 International Symposium ELMAR, September 2018. DOI: 10.23919/ELMAR.2018.8534661
Citations: 4
Abstract
This paper describes progress in the development of a multilingual speech-enabled interface that explores state-of-the-art deep learning techniques within the bilateral project "Deep Learning for Advanced Speech Enabled Applications". Advances are expected especially in automatic subtitling of broadcast television and radio programs, database creation, indexing, and information retrieval. This implies investigating deep learning techniques in the following sub-tasks: a) multilingual large-vocabulary continuous speech recognition, b) audio event detection, c) speaker clustering and diarization, d) segmentation of spoken discourse, speech, paragraphs, and sentences, e) emotion recognition, f) microphone-array/multi-channel speech enhancement, g) data mining, h) multilingual speech synthesis, and i) spoken dialogue user interfaces. The paper then describes the current work, the data available in the project, and the results achieved in the first task: a Slovak speech recognition module built in Kaldi using deep learning algorithms.
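To make the "deep learning algorithms" claim concrete: Kaldi-based hybrid systems of this kind typically train a neural network to map spliced acoustic feature frames to senone (tied HMM state) posteriors, which then replace GMM likelihoods during decoding. The sketch below, in PyTorch for readability, is an assumption-laden illustration rather than the paper's actual recipe; the feature dimension, splicing context, layer sizes, and senone count are all placeholders not taken from the paper.

    # Illustrative sketch only: a small feed-forward acoustic model of the
    # kind hybrid Kaldi setups train, written in PyTorch. FEAT_DIM, CONTEXT,
    # and NUM_SENONES are placeholder assumptions, not the paper's values.
    import torch
    import torch.nn as nn

    FEAT_DIM = 40       # assumed filterbank feature size per frame
    CONTEXT = 5         # assumed +/-5 frames of splicing around each frame
    NUM_SENONES = 2000  # placeholder; the real value comes from the GMM-HMM tree

    class FeedForwardAM(nn.Module):
        """Frame-level classifier: spliced features -> senone logits."""
        def __init__(self):
            super().__init__()
            in_dim = FEAT_DIM * (2 * CONTEXT + 1)
            self.net = nn.Sequential(
                nn.Linear(in_dim, 1024), nn.ReLU(),
                nn.Linear(1024, 1024), nn.ReLU(),
                nn.Linear(1024, NUM_SENONES),
            )

        def forward(self, x):
            return self.net(x)  # raw logits; CrossEntropyLoss applies log-softmax

    model = FeedForwardAM()
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # One dummy training step on random tensors standing in for spliced
    # feature frames and the senone labels a forced-alignment pass provides.
    frames = torch.randn(256, FEAT_DIM * (2 * CONTEXT + 1))
    labels = torch.randint(0, NUM_SENONES, (256,))
    optimizer.zero_grad()
    loss = loss_fn(model(frames), labels)
    loss.backward()
    optimizer.step()
    print(f"dummy step loss: {loss.item():.3f}")

In a real Kaldi pipeline the frame-level targets would come from a GMM-HMM forced alignment, and recognition would combine the network's posteriors with a pronunciation lexicon and language model inside a WFST decoder; none of those components are shown here.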