Towards Slovak-English-Mandarin Speech Recognition Using Deep Learning

Matus Pleva, Y. Liao, Wu-Hua Hsu, D. Hládek, J. Staš, P. Viszlay, M. Lojka, J. Juhár
2018 International Symposium ELMAR. Published 2018-09-01. DOI: 10.23919/ELMAR.2018.8534661 (https://doi.org/10.23919/ELMAR.2018.8534661)
Citations: 4

Abstract

This paper describes progress in developing a multilingual speech-enabled interface that explores state-of-the-art deep learning techniques within the bilateral project “Deep Learning for Advanced Speech Enabled Applications”. Advances are expected especially in automatic subtitling of broadcast television and radio programs, database creation, indexing, and information retrieval. This requires investigating deep learning techniques in the following sub-tasks: a) multilingual large-vocabulary continuous speech recognition, b) audio event detection, c) speaker clustering and diarization, d) spoken discourse, speech, paragraph, and sentence segmentation, e) emotion recognition, f) microphone-array/multi-channel speech enhancement, g) data mining, h) multilingual speech synthesis, and i) spoken dialogue user interfaces. The paper presents the current work, describes the data available in the project, and reports the results achieved in the first task: a Kaldi-based Slovak speech recognition module using deep learning algorithms.