Recovery of acronyms, out-of-lattice words and pronunciations from parallel multilingual speech

2012 IEEE Spoken Language Technology Workshop (SLT). Pub Date: 2012-12-01. DOI: 10.1109/SLT.2012.6424248

João Miranda, J. Neto, A. Black


Citations: 4

Abstract


In this work we present a set of techniques that exploit information from multiple versions of the same speech in different languages to improve Automatic Speech Recognition (ASR) performance. Using this redundant information, we are able to recover acronyms, words that cannot be found in the multiple hypotheses produced by the ASR systems, and pronunciations absent from their pronunciation dictionaries. When used together, the three techniques yield a 5.0% relative improvement in word error rate (WER) over our baseline system, and a 24.8% relative improvement over standard speech recognition, on a Europarl Committee dataset with three languages (Portuguese, Spanish and English). One full iteration of the system has a parallel Real Time Factor (RTF) of 3.08 and a sequential RTF of 6.44.
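The abstract reports relative WER improvements and Real Time Factors without spelling out how they are computed. The sketch below uses the conventional definitions (relative reduction as a fraction of the starting WER, RTF as processing time divided by audio duration), which is an assumption about the paper rather than something the abstract states. The function names and all input numbers are hypothetical and chosen only to illustrate the arithmetic; the abstract does not give absolute WER values or timings.

```python
def relative_wer_reduction(wer_before: float, wer_after: float) -> float:
    """Relative WER reduction as a fraction of the starting WER.

    Conventional definition (assumed, not taken from the paper).
    """
    if wer_before <= 0:
        raise ValueError("wer_before must be positive")
    return (wer_before - wer_after) / wer_before


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Real Time Factor: processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical WERs: dropping from 20.0% to 19.0% is a 5.0% relative gain.
    print(f"relative reduction: {relative_wer_reduction(20.0, 19.0):.1%}")

    # Hypothetical timing: one hour of audio processed in 11088 seconds.
    print(f"RTF: {real_time_factor(11088.0, 3600.0):.2f}")
```

Under these definitions, an RTF of 3.08 means that one hour of audio takes roughly three hours and five minutes to process, and an RTF of 6.44 roughly six hours and twenty-six minutes.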
