Spoken Language Identification with Deep Temporal Neural Network and Multi-levels Discriminative Cues

Linjia Sun
DOI: 10.1109/ICICSP50920.2020.9232093
Published in: 2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP), September 2020
Citations: 3

Abstract

The language cue is an important component of spoken language identification (LID), but aligning language cues to speech segments through manual annotation by professional linguists is time-consuming. Instead of annotating linguistic phonemes, we exploit co-occurrence statistics in speech utterances to discover the underlying phoneme-like speech units in an unsupervised manner. We then model phonotactic constraints on the set of phoneme-like units to find larger speech segments, called suprasegmental phonemes, and extract multi-level language cues from them, including phonetic, phonotactic, and prosodic cues. Furthermore, a novel LID system is proposed based on a TDNN architecture followed by an LSTM-RNN. The proposed LID system is built and compared with acoustic-feature-based and phonetic-feature-based methods on the NIST LRE07 task and on Arabic dialect identification. The experimental results show that our LID system captures robust discriminative information for short-duration language identification and achieves high accuracy for dialect identification.
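The abstract's core architecture is a TDNN (a dilated 1-D convolution over frame context) feeding an LSTM-RNN that summarises the utterance for language classification. The sketch below illustrates that pipeline shape only; all dimensions, weights, and the single-layer layout are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def tdnn_layer(x, w, b, dilation=1):
    """One TDNN layer: a dilated 1-D convolution over the time axis.
    x: (T, d_in); w: (ctx, d_in, d_out); b: (d_out,)."""
    ctx = w.shape[0]
    span = (ctx - 1) * dilation          # frames consumed by the context window
    T_out = x.shape[0] - span
    out = np.zeros((T_out, w.shape[2]))
    for t in range(T_out):
        for c in range(ctx):             # sum over the dilated context frames
            out[t] += x[t + c * dilation] @ w[c]
    return np.maximum(out + b, 0.0)      # ReLU activation

def lstm_forward(x, Wx, Wh, bias):
    """Plain LSTM over frames; gate order [input, forget, cell, output].
    x: (T, d); Wx: (d, 4h); Wh: (h, 4h); returns the final hidden state."""
    h = Wh.shape[0]
    h_t, c_t = np.zeros(h), np.zeros(h)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for t in range(x.shape[0]):
        z = x[t] @ Wx + h_t @ Wh + bias
        i, f = sig(z[:h]), sig(z[h:2*h])
        g, o = np.tanh(z[2*h:3*h]), sig(z[3*h:])
        c_t = f * c_t + i * g
        h_t = o * np.tanh(c_t)
    return h_t                            # utterance-level summary vector

# Toy dimensions (assumed for illustration)
T, d_feat, d_tdnn, d_lstm, n_lang = 50, 20, 32, 16, 5
x = rng.standard_normal((T, d_feat))                 # per-frame feature vectors
w1 = rng.standard_normal((3, d_feat, d_tdnn)) * 0.1  # context of 3 frames
h_frames = tdnn_layer(x, w1, np.zeros(d_tdnn), dilation=2)

Wx = rng.standard_normal((d_tdnn, 4 * d_lstm)) * 0.1
Wh = rng.standard_normal((d_lstm, 4 * d_lstm)) * 0.1
utt = lstm_forward(h_frames, Wx, Wh, np.zeros(4 * d_lstm))

# Linear classifier + softmax: one posterior per candidate language
logits = utt @ (rng.standard_normal((d_lstm, n_lang)) * 0.1)
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.shape)
```

The design point the architecture exploits: the TDNN captures short-span phonetic context efficiently, while the recurrent layer accumulates longer-range (phonotactic/prosodic) structure across the whole utterance before classification.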