IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01: Latest Publications

Language modeling for multi-domain speech-driven text retrieval
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034653
K. Itou, Atsushi Fujii, Tetsuya Ishikawa
Abstract: We report experimental results associated with speech-driven text retrieval, which facilitates retrieving information in multiple domains with spoken queries. Since users speak contents related to a target collection, we produce language models used for speech recognition based on the target collection, so as to improve both the recognition and retrieval accuracy. Experiments using existing test collections combined with dictated queries showed the effectiveness of our method.
Citations: 11
Very large vocabulary proper name recognition for directory assistance
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034627
F. Béchet, R. de Mori, G. Subsol
Abstract: This paper deals with the difficult task of recognition of a large vocabulary of proper names in a directory assistance application. After a presentation of the related work, it introduces a methodology for rescoring the N-best hypotheses generated by a first step recognition. First experiments give encouraging results and several topics for future research are presented.
Citations: 14
Investigating stochastic speech understanding
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034637
H. Bonneau-Maynard, F. Lefèvre
Abstract: The need for human expertise in the development of a speech understanding system can be greatly reduced by the use of stochastic techniques. However, corpus-based techniques require the annotation of large amounts of training data. Manual semantic annotation of such corpora is tedious, expensive, and subject to inconsistencies. This work investigates the influence of the training corpus size on the performance of the understanding module. The use of automatically annotated data is also investigated as a means to increase the corpus size at a very low cost. First, a stochastic speech understanding model developed using data collected with the LIMSI ARISE dialog system is presented. Its performance is shown to be comparable to that of the rule-based caseframe grammar currently used in the system. In a second step, two ways of reducing the development cost are pursued: (1) reducing the amount of manually annotated data used to train the stochastic models and (2) using automatically annotated data in the training process.
Citations: 20
Speaker-trained recognition using allophonic enrollment models
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034589
V. Vanhoucke, M. Hochberg, C. Leggetter
Abstract: We introduce a method for performing speaker-trained recognition based on context-dependent allophone models from a large-vocabulary, speaker-independent recognition system. A set of speaker-enrollment templates is selected from the context-dependent allophone models. These templates are used to build representations of the speaker-enrolled utterances. The advantages of this approach include improved performance and portability of the enrollments across different acoustic models. We describe the approach used to select the enrollment templates and how to apply them to speaker-trained recognition. The approach has been evaluated on an over-the-telephone, voice-activated dialing task and shows significant performance improvements over techniques based on context-independent phone models or general acoustic model templates. In addition, the portability of enrollments from one model set to another is shown to result in almost no performance degradation.
Citations: 2
High performance telephone bandwidth speaker independent continuous digit recognition
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034670
P. Cosi, J.-P. Hosom, A. Valente
Abstract: The development of a high-performance telephone-bandwidth speaker independent connected digit recognizer for Italian is described. The CSLU Speech Toolkit was used to develop and implement the hybrid ANN/HMM system, which is trained on context-dependent categories to account for coarticulatory variation. Various front-end processing and system architectures were compared and, when the best features (MFCC with CMS + Δ) and network (4-layer fully connected feed-forward network) were considered, there was a 98.92% word recognition accuracy and a 92.62% sentence recognition accuracy on a test set of the FIELD continuous digits recognition task.
Citations: 9
Pseudo 2-dimensional hidden Markov models in speech recognition
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034679
S. Werner, G. Rigoll
Abstract: In this paper, the usage of pseudo 2-dimensional hidden Markov models for speech recognition is discussed. This image processing method should better model the time-frequency structure in speech signals. The method calculates the emission probability of a standard HMM by an embedded HMM for each state. If a temporal sequence of spectral vectors is imagined as a spectrogram, this leads to a 2-dimensional warping of the spectrogram. This additional warping of the frequency axis could be useful for speaker-independent recognition and can be considered to be similar to a vocal tract normalization. The effects of this paradigm are investigated in this paper using the TI-Digits database.
Citations: 6
Task-specific adaptation of speech recognition models
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034677
A. Sankar, Ashvin Kannan, B. Shahshahani, E. Jackson
Abstract: Most published adaptation research focuses on speaker adaptation, and on adaptation for noisy channels and background environments. We study acoustic, grammar, and combined acoustic and grammar adaptation for creating task-specific recognition models. Comprehensive experimental results are presented using data from natural language quotes and a trading application. The results show that task adaptation gives substantial improvements in both utterance understanding accuracy and recognition speed.
Citations: 3
An online model adaptation method for compensating speech models for noise in continuous speech recognition
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034609
R. Lee, E. Choi
Abstract: This paper presents a method for online model adaptation based on the parallel model combination (PMC) method. The proposed method makes use of the concept of Gaussian model clustering to reduce the computation load required by PMC. This model clustering, in combination with a set of derived transformation equations, provides a potential framework for online model adaptation in noisy speech recognition. The proposed method reduces the computation in adaptation by about 45% with only a slight degradation in improvements of an average 18% for a connected digit task and 9% for a large vocabulary Mandarin task when compared with the standard PMC method.
Citations: 0
An open concept metric for assessing dialog system complexity
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034638
T. M. DuBois, Alexander I. Rudnicky
Abstract: Techniques for assessing dialog system performance commonly focus on characteristics of the interaction, using metrics such as completion, satisfaction or time on task. However, such metrics are not always capable of differentiating systems that operate on fundamentally different principles, particularly when tested on tasks that focus on common-denominator capabilities. We introduce a new metric, the open concept count, and show how it can be used to capture useful system properties of a dialog system.
Citations: 1
Markovian combination of language and prosodic models for better speech understanding and recognition
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01. Pub Date: 2001-12-09. DOI: 10.1109/ASRU.2001.1034615
A. Stolcke, Elizabeth Shriberg
Abstract: Summary form only given. Traditionally, "language" models capture only the word sequences of a language. A crucial component of spoken language, however, is its prosody, i.e., rhythmic and melodic properties. This paper summarizes recent work on integrated, computationally efficient modeling of word sequences and prosodic properties of speech, for a variety of speech recognition and understanding tasks, such as dialog act tagging, disfluency detection, and segmentation into sentences and topics. In each case it turns out that hidden Markov representations of the underlying structures and associated observations arise naturally, and allow existing speech recognizers to be combined with separately trained prosodic classifiers. The same HMM-based models can be used in two modes: to recover hidden structure (such as sentence boundaries), or to evaluate speech recognition hypotheses, thereby integrating prosody into the recognition process.
Citations: 3