零资源ASR的概率词汇建模与无监督训练

2013 IEEE Workshop on Automatic Speech Recognition and Understanding Pub Date : 2013-12-01 DOI:10.1109/ASRU.2013.6707771

Ramya Rasipuram, Marzieh Razavi, M. Magimai.-Doss

{"title":"零资源ASR的概率词汇建模与无监督训练","authors":"Ramya Rasipuram, Marzieh Razavi, M. Magimai.-Doss","doi":"10.1109/ASRU.2013.6707771","DOIUrl":null,"url":null,"abstract":"Standard automatic speech recognition (ASR) systems rely on transcribed speech, language models, and pronunciation dictionaries to achieve state-of-the-art performance. The unavailability of these resources constrains the ASR technology to be available for many languages. In this paper, we propose a novel zero-resourced ASR approach to train acoustic models that only uses list of probable words from the language of interest. The proposed approach is based on Kullback-Leibler divergence based hidden Markov model (KL-HMM), grapheme subword units, knowledge of grapheme-to-phoneme mapping, and graphemic constraints derived from the word list. The approach also exploits existing acoustic and lexical resources available in other resource rich languages. Furthermore, we propose unsupervised adaptation of KL-HMM acoustic model parameters if untranscribed speech data in the target language is available. We demonstrate the potential of the proposed approach through a simulated study on Greek language.","PeriodicalId":265258,"journal":{"name":"2013 IEEE Workshop on Automatic Speech Recognition and Understanding","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Probabilistic lexical modeling and unsupervised training for zero-resourced ASR\",\"authors\":\"Ramya Rasipuram, Marzieh Razavi, M. Magimai.-Doss\",\"doi\":\"10.1109/ASRU.2013.6707771\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Standard automatic speech recognition (ASR) systems rely on transcribed speech, language models, and pronunciation dictionaries to achieve state-of-the-art performance. The unavailability of these resources constrains the ASR technology to be available for many languages. In this paper, we propose a novel zero-resourced ASR approach to train acoustic models that only uses list of probable words from the language of interest. The proposed approach is based on Kullback-Leibler divergence based hidden Markov model (KL-HMM), grapheme subword units, knowledge of grapheme-to-phoneme mapping, and graphemic constraints derived from the word list. The approach also exploits existing acoustic and lexical resources available in other resource rich languages. Furthermore, we propose unsupervised adaptation of KL-HMM acoustic model parameters if untranscribed speech data in the target language is available. We demonstrate the potential of the proposed approach through a simulated study on Greek language.\",\"PeriodicalId\":265258,\"journal\":{\"name\":\"2013 IEEE Workshop on Automatic Speech Recognition and Understanding\",\"volume\":\"21 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE Workshop on Automatic Speech Recognition and Understanding\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2013.6707771\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE Workshop on Automatic Speech Recognition and Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2013.6707771","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 7

摘要

标准的自动语音识别(ASR)系统依赖于转录语音、语言模型和发音字典来实现最先进的性能。这些资源的不可用性限制了ASR技术对许多语言的可用性。在本文中，我们提出了一种新的零资源ASR方法来训练声学模型，该模型仅使用感兴趣语言中的可能单词列表。该方法基于基于Kullback-Leibler散度的隐马尔可夫模型(KL-HMM)、字素子词单元、字素到音素映射的知识以及从单词列表中导出的字素约束。该方法还利用了其他资源丰富的语言中现有的声学和词汇资源。此外，如果目标语言中有未转录的语音数据，我们建议对KL-HMM声学模型参数进行无监督自适应。我们通过对希腊语的模拟研究证明了所提出方法的潜力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Probabilistic lexical modeling and unsupervised training for zero-resourced ASR

Standard automatic speech recognition (ASR) systems rely on transcribed speech, language models, and pronunciation dictionaries to achieve state-of-the-art performance. The unavailability of these resources constrains the ASR technology to be available for many languages. In this paper, we propose a novel zero-resourced ASR approach to train acoustic models that only uses list of probable words from the language of interest. The proposed approach is based on Kullback-Leibler divergence based hidden Markov model (KL-HMM), grapheme subword units, knowledge of grapheme-to-phoneme mapping, and graphemic constraints derived from the word list. The approach also exploits existing acoustic and lexical resources available in other resource rich languages. Furthermore, we propose unsupervised adaptation of KL-HMM acoustic model parameters if untranscribed speech data in the target language is available. We demonstrate the potential of the proposed approach through a simulated study on Greek language.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2013 IEEE Workshop on Automatic Speech Recognition and Understanding

自引率

0.00%

发文量