2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)最新文献_第2页

Robust speech recognition by properly utilizing reliable frames and segments in corrupted signals 通过在损坏信号中适当地利用可靠的帧和段来实现鲁棒语音识别

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430091

Yi Chen, C. Wan, Lin-Shan Lee

引用次数: 1

Building a highly accurate Mandarin speech recognizer 构建高精度的普通话语音识别器

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430161

M. Hwang, Gang Peng, Wen Wang, Arlo Faria, A. Heidel, Mari Ostendorf

引用次数: 28

Minimum mutual information beamforming for simultaneous active speakers 同时有源说话者的最小互信息波束形成

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430086

K. Kumatani, U. Mayer, Tobias Gehrig, Emilian Stoimenov, J. McDonough, Matthias Wölfel

{"title":"Minimum mutual information beamforming for simultaneous active speakers","authors":"K. Kumatani, U. Mayer, Tobias Gehrig, Emilian Stoimenov, J. McDonough, Matthias Wölfel","doi":"10.1109/ASRU.2007.4430086","DOIUrl":"https://doi.org/10.1109/ASRU.2007.4430086","url":null,"abstract":"In this work, we address an acoustic beamforming application where two speakers are simultaneously active. We construct one subband domain beamformer in generalized sidelobe canceller (GSC) configuration for each source. In contrast to normal practice, we then jointly adjust the active weight vectors of both GSCs to obtain two output signals with minimum mutual information (MMI). In order to calculate the mutual information of the complex subband snapshots, we consider four probability density functions (pdfs), namely the Gaussian, Laplace, K0 and lceil pdfs. The latter three belong to the class of super-Gaussian density functions that are typically used in independent component analysis as opposed to conventional beam-forming. We demonstrate the effectiveness of our proposed technique through a series of far-field automatic speech recognition experiments on data from the PASCAL Speech Separation Challenge. In the experiments, the delay-and-sum beamformer achieved a word error rate (WER) of 70.4 %. The MMI beamformer under a Gaussian assumption achieved 55.2 % WER which was further reduced to 52.0 % with a K0 pdf, whereas the WER for data recorded with close-talking microphone was 21.6 %.","PeriodicalId":371729,"journal":{"name":"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129399289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 6

Development of the 2007 RWTH Mandarin LVCSR system 2007年工业大学文华LVCSR系统的发展

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430155

Björn Hoffmeister, Christian Plahl, P. Fritz, G. Heigold, J. Lööf, R. Schlüter, H. Ney

引用次数: 23

Adapting grapheme-to-phoneme conversion for name recognition 适应字素到音素的名称识别转换

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430097

Xiao Li, A. Gunawardana, A. Acero

引用次数: 25

Interpolative variable frame rate transmission of speech features for distributed speech recognition 分布式语音识别中语音特征的插值变帧率传输

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430179

Huiqun Deng, D. O'Shaughnessy, Jean-Guy Dahan, W. Ganong

引用次数: 6

Recognition and understanding of meetings the AMI and AMIDA projects 对AMI和AMIDA项目会议的认识和理解

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430116

S. Renals, Thomas Hain, H. Bourlard

引用次数: 144

Voice/audio information retrieval: minimizing the need for human ears 语音/音频信息检索:尽量减少对人耳的需求

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430183

M. Clements, M. Gavaldà

引用次数: 3

Unsupervised state clustering for stochastic dialog management 随机对话管理的无监督状态聚类

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430171

F. Lefèvre, R. Mori

{"title":"Unsupervised state clustering for stochastic dialog management","authors":"F. Lefèvre, R. Mori","doi":"10.1109/ASRU.2007.4430171","DOIUrl":"https://doi.org/10.1109/ASRU.2007.4430171","url":null,"abstract":"Following recent studies in stochastic dialog management, this paper introduces an unsupervised approach aiming at reducing the cost and complexity for the setup of a probabilistic POMDP-based dialog manager. The proposed method is based on a first decoding step deriving semantic basic constituents from user utterances. These isolated units and some relevant context features (as previous system actions, previous user utterances...) are combined to form vectors representing the on-going dialog states. After a clustering step, each partition of this space is intented to represent a particular dialog state. Then any new utterance can be classified according to these automatic states and the belief state can be updated before the POMDP-based dialog manager can take a decision on the best next action to perform. The proposed approach is applied to the French media task (tourist information and hotel booking). The media 10k-utterance training corpus is semantically rich (over 80 basic concepts) and is segmentally annotated in terms of basic concepts. Before user trials can be carried out, some insights on the method effectiveness are obtained by analysis of the convergence of the POMDP models.","PeriodicalId":371729,"journal":{"name":"2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116309298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8

Speech enhancement using PCA and variance of the reconstruction error in distributed speech recognition 分布式语音识别中基于PCA和重构误差方差的语音增强

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date : 2007-12-01 DOI: 10.1109/ASRU.2007.4430077

Amin Haji Abolhassani, S. Selouani, D. O'Shaughnessy

引用次数: 13