IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.最新文献

筛选
英文 中文
Directory assistance: learning user formulations for business listings 目录协助:学习企业列表的用户公式
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034634
C. Popovici, M. Andorno, P. Laface, L. Fissore, M. Nigra, C. Vair
{"title":"Directory assistance: learning user formulations for business listings","authors":"C. Popovici, M. Andorno, P. Laface, L. Fissore, M. Nigra, C. Vair","doi":"10.1109/ASRU.2001.1034634","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034634","url":null,"abstract":"One of the main problems in automatic directory assistance (DA) for business listings is that customers formulate their requests for the same listing with a great variability. We show that an automatic approach allows the detection, from field data, of user formulations that were not foreseen by the designers, and that they can be added, as variants, to the denominations already included in the system to reduce its failures.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116802537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Evaluating long-term spectral subtraction for reverberant ASR 评估混响ASR的长期频谱减法
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034598
David Gelbart, Nelson Morgan
{"title":"Evaluating long-term spectral subtraction for reverberant ASR","authors":"David Gelbart, Nelson Morgan","doi":"10.1109/ASRU.2001.1034598","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034598","url":null,"abstract":"Even a modest degree of room reverberation can greatly increase the difficulty of automatic speech recognition. We have observed large increases in speech recognition word error rates when using a far-field (3-6 feet) microphone in a conference room, in comparison with recordings from head-mounted microphones. In this paper, we describe experiments with a proposed remedy based on the subtraction of an estimate of the log spectrum from a long-term (e.g., 2 s) analysis window, followed by overlap-add resynthesis. Since the technique is essentially one of enhancement, the processed signal it generates can be used as input for complete speech recognition systems. Here we report results with both the HTK and the SRI Hub-5 recognizer. For simpler recognizer configurations and/or moderate-sized training, the improvements are huge, while moderate improvements are still observed for more complex configurations under a number of conditions.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117213225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 46
n-gram and decision tree based language identification for written words 基于N-gram和决策树的文字识别
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034655
J. Hakkinen, Jilei Tian
{"title":"n-gram and decision tree based language identification for written words","authors":"J. Hakkinen, Jilei Tian","doi":"10.1109/ASRU.2001.1034655","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034655","url":null,"abstract":"As the demand for multilingual speech recognizers increases, the development of systems which combine automatic language identification, language-specific pronunciation modeling and language-independent acoustic models becomes increasingly important. When the recognition grammar is dynamic and obtained directly from written text, the language associated with each grammar item has to be identified using that text. Many methods proposed in the literature require fairly large amounts of text, which may not always be available. This paper describes a text-based language identification system developed for the identification of the language of short words, e.g., proper names. Two different approaches are compared. The n-gram method commonly used in the literature is first reviewed and further enhanced. We also propose a simple method for language identification that is based on decision trees. The methods are first evaluated in a text-based language identification task. Both methods are also tested as preprocessors for a multilingual speech recognition task, where the language of each text item has to be determined, in order to choose the correct text-to-pronunciation mapping. The experimental results show that the proposed methods perform very well, and merit further development.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124946275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 54
The ALERT system: advanced broadcast speech recognition technology for selective dissemination of multimedia information ALERT系统:先进的广播语音识别技术,用于选择性地传播多媒体信息
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034647
G. Rigoll
{"title":"The ALERT system: advanced broadcast speech recognition technology for selective dissemination of multimedia information","authors":"G. Rigoll","doi":"10.1109/ASRU.2001.1034647","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034647","url":null,"abstract":"This paper presents a brief description of the ALERT system, which is under development by a consortium working on a research project sponsored by the European Commission. The ALERT system uses advanced speech recognition technology and video processing techniques in order to process large broadcast speech archives and multimedia information resources for the purpose of extracting specific information from such databases and inform selected customers about its contents. It is one of the most ambitious projects currently carried out in the human language technologies (HLT) area (see also http://alert.uni-duisburg.de). The paper describes the objectives of the overall system, its basic system architecture and the scientific approach taken in order to realize the specified demonstrators.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114565900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
CORBA-based speech-to-speech translation system 基于corba的语音到语音翻译系统
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034660
R. Gruhn, K. Takashima, A. Nishino, S. Nakamura
{"title":"CORBA-based speech-to-speech translation system","authors":"R. Gruhn, K. Takashima, A. Nishino, S. Nakamura","doi":"10.1109/ASRU.2001.1034660","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034660","url":null,"abstract":"We describe the new implementation of a speech-to-speech translation system at ATR Spoken Language Translation Research Laboratories (SLT). We use the architecture standard CORBA (Common Object Request Broker Architecture) to interface between a speech recognizer, translation system and TTS engine. Various input types are supported, including close-talking microphone and telephony hardware.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129177184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition 基于pca优化滤波器组的语音识别改进MFCC特征提取
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034586
Shang-Ming Lee, Shih-Hau Fang, J. Hung, Lin-Shan Lee
{"title":"Improved MFCC feature extraction by PCA-optimized filter-bank for speech recognition","authors":"Shang-Ming Lee, Shih-Hau Fang, J. Hung, Lin-Shan Lee","doi":"10.1109/ASRU.2001.1034586","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034586","url":null,"abstract":"Although Mel-frequency cepstral coefficients (MFCC) have been proven to perform very well under most conditions, some limited efforts have been made in optimizing the shape of the filters in the filter-bank in the conventional MFCC approach. This paper presents a new feature extraction approach that designs the shapes of the filters in the filter-bank. In this new approach, the filter-bank coefficients are data-driven and obtained by applying principal component analysis (PCA) to the FFT spectrum of the training data. The experimental results show that this method is robust under noisy environment and is well additive with other noise-handling techniques.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123726179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 38
Language models beyond word strings 超越字串的语言模型
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034614
E. Noth, A. Batliner, H. Niemann, G. Stemmer, F. Gallwitz, J. Spilker
{"title":"Language models beyond word strings","authors":"E. Noth, A. Batliner, H. Niemann, G. Stemmer, F. Gallwitz, J. Spilker","doi":"10.1109/ASRU.2001.1034614","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034614","url":null,"abstract":"In this paper we want to show how n-gram language models can be used to provide additional information in automatic speech understanding systems beyond the pure word chain. This becomes important in the context of conversational dialogue systems that have to recognize and interpret spontaneous speech. We show how n-grams can: (1) help to classify prosodic events like boundaries and accents; (2) be extended to directly provide boundary information in the speech recognition phase; (3) help to process speech repairs; and (4) detect and semantically classify out-of-vocabulary words. The approaches can work on the best word chain or a word hypotheses graph. Examples and experimental results are provided from our own research within the EVAR information retrieval system and the VERBMOBIL speech-to-speech translation system.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116483795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Dialogue management in the Talk'n'Travel system Talk'n'Travel系统中的对话管理
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034631
D. Stallard
{"title":"Dialogue management in the Talk'n'Travel system","authors":"D. Stallard","doi":"10.1109/ASRU.2001.1034631","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034631","url":null,"abstract":"A central problem for mixed-initiative dialogue management is coping with user utterances that fall outside of the expected sequence of dialogue. Independent initiative by the user may require a complete revision of the future course of the dialogue, even when the system is engaged in activities of its own, such as querying a database, etc. This paper presents an event-driven, goal-based dialogue manager component we have developed to cope with these challenges. The dialog manager is explicitly designed for asynchronous input and flexible control, and uses a tree-ordered rule language we have developed that also provides for close coupling with discourse processing. The dialogue manager is implemented as part of Talk'n'Travel, a simulated air travel reservation dialogue system we have developed under the US DARPA Communicator dialogue research program, whose purpose and scope we also briefly summarize.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114688771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
The statistical approach to spoken language translation 口语翻译的统计方法
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034663
H. Ney
{"title":"The statistical approach to spoken language translation","authors":"H. Ney","doi":"10.1109/ASRU.2001.1034663","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034663","url":null,"abstract":"This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the VERBMOBIL project. The goal of the VERBMOBIL project is the translation of spoken dialogues in the domains of appointment scheduling and travel planning. Starting with the Bayes decision rule as in speech recognition; we show how the required probability distributions can be structured into three parts: the language model, the alignment model and the lexicon model. We describe the components of the system and report results on the VERBMOBIL task. The experience obtained in the VERBMOBIL project, in particular a largescale end-to-end evaluation, showed that the statistical approach resulted in significantly lower error rates than three competing translation approaches: the sentence error rate was 29% in comparison with 52% to 62% for the other translation approaches. Finally, we discuss the integrated approach to speech translation as opposed to the serial approach that is widely used nowadays.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126359279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
Computing consensus translation from multiple machine translation systems 从多个机器翻译系统计算共识翻译
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date : 2001-12-09 DOI: 10.1109/ASRU.2001.1034659
B. Bangalore, Germán Bordel, G. Riccardi
{"title":"Computing consensus translation from multiple machine translation systems","authors":"B. Bangalore, Germán Bordel, G. Riccardi","doi":"10.1109/ASRU.2001.1034659","DOIUrl":"https://doi.org/10.1109/ASRU.2001.1034659","url":null,"abstract":"We address the problem of computing a consensus translation given the outputs from a set of machine translation (MT) systems. The translations from the MT systems are aligned with a multiple string alignment algorithm and the consensus translation is then computed. We describe the multiple string alignment algorithm and the consensus MT hypothesis computation. We report on the subjective and objective performance of the multilingual acquisition approach on a limited domain spoken language application. We evaluate five domain-independent off-the-shelf MT systems and show that the consensus-based translation performance is equal to or better than any of the given MT systems, in terms of both objective and subjective measures.","PeriodicalId":118671,"journal":{"name":"IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126434804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 177
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信