2009 IEEE Workshop on Automatic Speech Recognition & Understanding: Latest Publications

Detection of OOV words by combining acoustic confidence measures with linguistic features
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-13. DOI: 10.1109/ASRU.2009.5372877
F. Stouten, D. Fohr, I. Illina
Abstract: This paper describes the design of an out-of-vocabulary (OOV) word detector. Such a system detects segments in the output of an LVCSR system that correspond to OOV words, i.e. words not included in the lexicon. The OOV detector uses acoustic confidence measures derived from several systems: a word recognizer constrained by a lexicon, a phone recognizer constrained by a grammar, and a phone recognizer without constraints. In addition, it uses several linguistic features. Experimental results on a French broadcast news transcription task show that, for this approach, precision equals recall at 35%.
Citations: 5
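A minimal sketch of the flavor of acoustic confidence measure described above: a duration-normalized log-likelihood ratio between a constrained decode and the unconstrained phone decode, fused with a linguistic feature through a simple logistic combination. This is an illustration in Python, not the paper's actual detector; the function names, weights, and numbers are hypothetical.

```python
import numpy as np

def acoustic_confidence(ll_constrained, ll_free, n_frames):
    # Duration-normalized log-likelihood ratio between a constrained decode
    # (word recognizer or grammar-constrained phone recognizer) and the
    # unconstrained phone recognizer over one hypothesized word segment.
    # Low values mean the lexicon explains the segment poorly: a possible OOV.
    return (ll_constrained - ll_free) / max(n_frames, 1)

def oov_score(features, weights, bias=0.0):
    # Logistic combination of acoustic confidences and linguistic features
    # into a single OOV detection score in [0, 1].
    z = float(np.dot(weights, features) + bias)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical segment where both constrained decodes fit worse than the free phone loop.
feats = np.array([acoustic_confidence(-520.0, -480.0, 35),   # word recognizer vs. free phones
                  acoustic_confidence(-505.0, -480.0, 35),   # grammar-phone vs. free phones
                  1.0])                                       # e.g. a language-model back-off flag
weights = np.array([-2.0, -1.0, 0.5])                         # illustrative, not trained values
print(f"OOV score: {oov_score(feats, weights):.3f}")
```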
Active learning for rule-based and corpus-based Spoken Language Understanding models
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373377
Pierre Gotab, Frédéric Béchet, Géraldine Damnati
Abstract: Active learning can be used for the maintenance of a deployed Spoken Dialog System (SDS) that evolves over time and for which large collections of dialog traces can be gathered daily. At the Spoken Language Understanding (SLU) level this maintenance process is crucial, since a deployed SDS evolves quickly as services are added, modified, or dropped. Knowledge-based approaches, based on manually written grammars or inference rules, are often preferred because system designers can directly modify the SLU models to account for such service changes, even when little or no related data has been collected. However, as new examples are added to the annotated corpus, corpus-based methods can be applied, either replacing or complementing the initial knowledge-based models. This paper describes an active learning scheme, based on an SLU criterion, used to automatically update the SLU models of a deployed SDS. Two kinds of SLU models are compared: rule-based models used in the deployed system, consisting of several thousand hand-crafted rules, and corpus-based models, built by automatically learning classifiers on an annotated corpus.
Citations: 11
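The selection step of such a scheme can be as simple as uncertainty sampling: hand the annotators the dialog traces on which the deployed SLU model is least confident. The Python sketch below illustrates that generic idea on made-up data; the paper's actual SLU criterion is not reproduced here.

```python
def select_for_annotation(utterances, confidences, budget):
    # Return the `budget` utterances with the lowest SLU confidence; these are
    # the candidates sent for manual annotation before retraining the
    # corpus-based classifiers.
    ranked = sorted(zip(confidences, utterances))        # ascending confidence
    return [utt for _, utt in ranked[:budget]]

# Toy daily batch of dialog traces with confidences from the deployed SLU model.
traces = ["je veux resilier mon abonnement", "euh alors voila", "probleme de facture"]
confs = [0.92, 0.31, 0.58]
print(select_for_annotation(traces, confs, budget=1))    # -> the least confident trace
```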
Noise robust model adaptation using linear spline interpolation
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373430
K. Kalgaonkar, M. Seltzer, A. Acero
Abstract: This paper presents a novel data-driven technique for adapting acoustic models to noisy environments. In the presence of additive noise, the relationship between the log mel spectra of speech, noise, and noisy speech is nonlinear. Traditional methods linearize this relationship around the mode of the nonlinearity or use some other approximation. The approach presented here models the nonlinear relationship using linear spline regression: the set of spline parameters that minimizes the error between predicted and actual noisy speech features is learned from training data and used at runtime to adapt clean acoustic model parameters to the current noise conditions. Experiments on the Aurora 2 task show that the proposed adaptation algorithm (89.22% word accuracy) outperforms VTS model adaptation (88.38% word accuracy).
Citations: 8
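The core fitting step lends itself to a compact illustration: with a fixed set of knots, a linear spline is just a least-squares regression on a truncated-linear (hinge) basis, and the fitted curve can then map clean model means into the current noise condition. The Python sketch below uses synthetic data and arbitrary knots; it is not the paper's exact parameterization or adaptation procedure.

```python
import numpy as np

def spline_basis(x, knots):
    # Truncated-linear (hinge) basis for a linear spline with fixed knots:
    # [1, x, (x - k1)_+, (x - k2)_+, ...].
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.stack(cols, axis=1)

def fit_linear_spline(x_clean, y_noisy, knots):
    # Least-squares spline weights minimizing the error between predicted
    # and observed noisy log-mel features (one frequency band).
    B = spline_basis(x_clean, knots)
    w, *_ = np.linalg.lstsq(B, y_noisy, rcond=None)
    return w

def adapt_mean(mu_clean, w, knots):
    # Map a clean acoustic-model mean through the learned spline to estimate
    # its value under the current noise condition.
    return float((spline_basis(np.array([mu_clean]), knots) @ w)[0])

# Synthetic illustration: noisy log-energy saturates toward the noise floor.
rng = np.random.default_rng(0)
clean = rng.uniform(-5.0, 15.0, 500)
noisy = np.logaddexp(clean, 5.0) + 0.1 * rng.standard_normal(500)  # log-sum of speech and noise
knots = np.linspace(-3.0, 13.0, 6)
w = fit_linear_spline(clean, noisy, knots)
print("adapted mean for mu_clean = 2.0:", adapt_mean(2.0, w, knots))
```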
Automatic detection of vowel pronunciation errors using multiple information sources
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/asru.2009.5373335
Joost van Doremalen, C. Cucchiarini, H. Strik
Abstract: Frequent pronunciation errors made by L2 learners of Dutch often involve vowel substitutions. ASR-based confidence measures (CMs) are generally used to detect such pronunciation errors. This paper compares and combines confidence measures with MFCCs and phonetic features. The results show that MFCCs perform best, followed by CMs and then phonetic features, and that substantial improvements can be obtained by combining the different feature types.
Citations: 26
The Asian network-based speech-to-speech translation system
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373353
S. Sakti, Noriyuki Kimura, Michael Paul, Chiori Hori, E. Sumita, Satoshi Nakamura, Jun Park, C. Wutiwiwatchai, Bo Xu, Hammam Riza, K. Arora, C. Luong, Haizhou Li
Abstract: This paper outlines the first Asian network-based speech-to-speech translation system, developed by the Asian Speech Translation Advanced Research (A-STAR) consortium. The system translates common spoken utterances from travel conversations from a given source language into multiple target languages, facilitating multiparty travel conversations between people speaking different Asian languages. Each A-STAR member contributes one or more of the following spoken language technologies through Web servers: automatic speech recognition, machine translation, and text-to-speech. The system currently covers nine languages: eight Asian languages (Hindi, Indonesian, Japanese, Korean, Malay, Thai, Vietnamese, and Chinese) plus English. Its domain covers about 20,000 travel expressions, including proper nouns that are names of famous places or attractions in Asian countries. The paper discusses the difficulties involved in connecting different spoken language translation systems through Web servers and presents speech-translation results from the first A-STAR demo experiments, carried out in July 2009.
Citations: 15
Acoustic emotion recognition: A benchmark comparison of performances
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5372886
Björn Schuller, Bogdan Vlasenko, F. Eyben, G. Rigoll, A. Wendemuth
Abstract: In light of the first challenge on emotion recognition from speech, this paper provides the largest benchmark comparison to date under equal conditions on nine standard corpora, using the two predominant paradigms: frame-level modeling by means of hidden Markov models, and supra-segmental modeling by systematic feature brute-forcing. The investigated corpora are the ABC, AVIC, DES, EMO-DB, eNTERFACE, SAL, SmartKom, SUSAS, and VAM databases. To improve comparability among sets, each database's emotions are additionally clustered into binary valence and arousal discrimination tasks. The results show large differences among corpora, stemming mostly from naturalistic emotions and spontaneous speech versus more prototypical events. Further, supra-segmental modeling proves significantly beneficial on average when several classes are addressed at a time.
Citations: 268
Speaker de-identification via voice transformation
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373356
Qin Jin, Arthur R. Toth, Tanja Schultz, A. Black
Abstract: Modern automated voice-driven applications and services commonly record and transmit a user's spoken request. At the same time, several domains and applications may require that the content of the request be transmitted while the speaker's identity is kept confidential. This requires technology that de-identifies the speaker's voice, so that the voice sounds natural and intelligible but does not reveal who is speaking. This paper investigates different voice transformation strategies on a large population of speakers to disguise speaker identity while preserving the intelligibility of the voices. Two automatic speaker identification approaches, a GMM-based and a phonetic one, are applied to verify the success of de-identification by voice transformation. The evaluation confirms that the proposed voice transformation technique enables transmission of the content of users' spoken requests while successfully concealing their identities, and the results indicate that different speakers still sound distinct after transformation. Furthermore, a human listening test showed the transformed speech to be both intelligible and securely de-identified, hiding the speakers' identities even from listeners who knew them very well.
Citations: 55
Sub-band modulation spectrum compensation for robust speech recognition
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373506
Wen-hsiang Tu, Sheng-Yuan Huang, J. Hung
Abstract: This paper proposes a novel scheme for applying feature statistics normalization techniques for robust speech recognition. In the proposed approach, the temporal-domain feature sequence is first converted into the modulation spectral domain. The magnitude part of the modulation spectrum is decomposed into non-uniform sub-band segments, and each sub-band segment is individually processed by well-known normalization methods such as mean normalization (MN), mean and variance normalization (MVN), and histogram equalization (HEQ). Finally, the feature stream is reconstructed from the modified sub-band magnitude spectral segments and the original phase spectrum using the inverse DFT. With this process, the components corresponding to the more important modulation spectral bands in the feature sequence can be processed separately. For the Aurora-2 clean-condition training task, the proposed sub-band spectral MVN and HEQ provide relative error rate reductions of 18.66% and 23.58% over conventional temporal MVN and HEQ, respectively.
Citations: 11
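To make the processing chain concrete, the Python sketch below applies sub-band MVN in the modulation spectral domain to a single feature trajectory: DFT, per-band rescaling of the magnitude toward reference statistics, then resynthesis with the original phase. The band edges and reference statistics are illustrative assumptions, and the paper's exact MN/MVN/HEQ formulations may differ.

```python
import numpy as np

def subband_mvn(feature_seq, band_edges, ref_mean, ref_std):
    # Sub-band modulation-spectrum MVN of one cepstral-coefficient trajectory:
    # DFT the sequence, rescale the magnitude within each sub-band toward
    # reference statistics, then resynthesize with the original phase.
    spec = np.fft.rfft(feature_seq)
    mag, phase = np.abs(spec), np.angle(spec)
    for b, (lo, hi) in enumerate(band_edges):
        seg = mag[lo:hi]
        if seg.size > 1 and seg.std() > 0:
            mag[lo:hi] = (seg - seg.mean()) / seg.std() * ref_std[b] + ref_mean[b]
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(feature_seq))

# Toy example: one MFCC trajectory and non-uniform bands that emphasize the
# low modulation frequencies (band edges and reference statistics are made up;
# in practice the references would be estimated from clean training data).
T = 200
traj = np.random.default_rng(1).standard_normal(T).cumsum() * 0.1
bands = [(0, 4), (4, 12), (12, 32), (32, T // 2 + 1)]
ref_mean = np.array([8.0, 4.0, 2.0, 1.0])
ref_std = np.array([2.0, 1.5, 1.0, 0.5])
print(subband_mvn(traj, bands, ref_mean, ref_std).shape)
```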
Diagonal priors for full covariance speech recognition
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373344
P. Bell, Simon King
Abstract: We investigate the use of full covariance Gaussians for large-vocabulary speech recognition. The large number of parameters gives high modelling power, but when training data is limited, the standard sample covariance matrix is often poorly conditioned and has high variance. We explain how these problems may be solved by the use of a diagonal covariance smoothing prior and relate this to the shrinkage estimator, for which the optimal shrinkage parameter may itself be estimated from the training data. We also compare the use of generatively and discriminatively trained priors. Results are presented on a large-vocabulary conversational telephone speech recognition task.
Citations: 8
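The diagonal smoothing prior amounts to shrinking the sample covariance toward its own diagonal. Below is a minimal numpy sketch with an arbitrary fixed shrinkage weight, rather than the data-estimated optimal value the paper discusses.

```python
import numpy as np

def shrink_to_diagonal(sample_cov, lam):
    # Smooth the sample covariance toward its diagonal:
    #   Sigma = (1 - lam) * S + lam * diag(S)
    # lam = 0 keeps the full sample covariance; lam = 1 falls back to a
    # purely diagonal model.
    return (1.0 - lam) * sample_cov + lam * np.diag(np.diag(sample_cov))

# Toy illustration: few frames in a 39-dimensional feature space give a
# rank-deficient, poorly conditioned sample covariance; shrinkage fixes that.
rng = np.random.default_rng(2)
X = rng.standard_normal((20, 39))
S = np.cov(X, rowvar=False)
print("condition number before:", np.linalg.cond(S))
print("condition number after: ", np.linalg.cond(shrink_to_diagonal(S, lam=0.3)))
```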
Optimal quantization and bit allocation for compressing large discriminative feature space transforms
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373407
E. Marcheret, V. Goel, P. Olsen
Abstract: Discriminative training of the feature space using the minimum phone error (MPE) objective function has been shown to yield remarkable accuracy improvements. These gains, however, come at the high cost of memory required to store the transform. In a previous paper we reduced this memory requirement by 94% by quantizing the transform parameters, using dimension-dependent quantization tables and learning the quantization values with a fixed assignment of transform parameters to quantization values. In this paper we refine and extend these techniques to attain a further 35% reduction in memory with no degradation in sentence error rate. We describe a principled method to assign transform parameters to quantization values, and show how memory can be gradually reduced using a Viterbi algorithm that optimally assigns a variable number of bits to the dimension-dependent quantization tables. The techniques described could also be applied to the quantization of general linear transforms, a problem of wider interest.
Citations: 5
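One way to read the bit-allocation step is as a trellis search: each dimension may spend 0..B bits, and a dynamic program over (dimension, bits spent) finds the assignment minimizing total quantization distortion under the global budget. The Python sketch below implements that generic dynamic program on a made-up distortion table; it is an interpretation of the idea, not the paper's exact algorithm.

```python
import numpy as np

def allocate_bits(distortion, total_bits):
    # distortion[d][b] = quantization distortion of dimension d when its
    # quantization table is given b bits. A dynamic program over the trellis
    # of (dimension, bits spent so far) returns the per-dimension bit counts
    # minimizing total distortion under the global budget.
    n_dims, max_b = distortion.shape
    INF = float("inf")
    cost = np.full((n_dims + 1, total_bits + 1), INF)
    back = np.zeros((n_dims + 1, total_bits + 1), dtype=int)
    cost[0, 0] = 0.0
    for d in range(n_dims):
        for used in range(total_bits + 1):
            if cost[d, used] == INF:
                continue
            for b in range(min(max_b, total_bits - used + 1)):
                c = cost[d, used] + distortion[d, b]
                if c < cost[d + 1, used + b]:
                    cost[d + 1, used + b] = c
                    back[d + 1, used + b] = b
    used = int(np.argmin(cost[n_dims]))          # best terminal state
    bits = []
    for d in range(n_dims, 0, -1):               # trace back the chosen bits
        b = int(back[d, used])
        bits.append(b)
        used -= b
    return bits[::-1]

# Made-up distortion table: 4 dimensions, 0..5 bits each; distortion roughly
# halves per extra bit, with a dimension-dependent scale.
scales = np.array([4.0, 2.0, 1.0, 0.5])
distortion = scales[:, None] * 2.0 ** -np.arange(6)[None, :]
print(allocate_bits(distortion, total_bits=8))
```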