2004 International Symposium on Chinese Spoken Language Processing: Latest Publications

Text-independent speaker verification based on relation of MFCC components
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409585
G. Ou, Dengfeng Ke
Abstract: GMM is prevalent for speaker verification. It performs very well but needs a background model to provide a reference value, which greatly influences the error rate. To obtain good generalization, a large database covering many people is needed to train the background model. In this paper, a new method without a background model is proposed, called the correlation and kernel function method (CK method). In the CK method, the correlation and uncorrelation of MFCC components are used to identify individuals, and a kernel function is used to compute the likelihood of two models. The method runs more than 30 times faster than GMM, requires less training data and less storage for its models, yet its performance is nearly identical to that of GMM, making it suitable for real-time computation.
Citations: 9
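The CK method, as summarized above, keys on correlations between MFCC components plus a kernel comparison of two models. A minimal sketch of that idea, with hypothetical function names and an RBF kernel standing in for the paper's unspecified kernel:

```python
import math

def correlation_matrix(frames):
    """Pearson correlations between MFCC dimensions, computed over all frames.

    frames: list of equal-length MFCC vectors (one per analysis frame).
    """
    dims, n = len(frames[0]), len(frames)
    means = [sum(f[d] for f in frames) / n for d in range(dims)]

    def cov(a, b):
        return sum((f[a] - means[a]) * (f[b] - means[b]) for f in frames) / n

    corr = [[0.0] * dims for _ in range(dims)]
    for a in range(dims):
        for b in range(dims):
            denom = math.sqrt(cov(a, a) * cov(b, b)) or 1.0  # guard zero variance
            corr[a][b] = cov(a, b) / denom
    return corr

def kernel_score(m1, m2, gamma=1.0):
    """RBF kernel over flattened correlation matrices: similarity in (0, 1]."""
    sq = sum((x - y) ** 2 for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))
    return math.exp(-gamma * sq)
```

Comparing a test utterance's correlation matrix against a stored one needs no background model, which is where the claimed speed and storage advantage over GMM scoring would come from.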
Maximum entropy modeling for speech recognition
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409569
H. Kuo
Abstract: Summary form only given. Maximum entropy (maxent) models have become very popular in natural language processing. We begin with a basic introduction to the maximum entropy principle, cover the popular algorithms for training maxent models, and describe how maxent models have been used in language modeling and, more recently, acoustic modeling for speech recognition. Some comparisons with other discriminative modeling methods are made. A substantial amount of time is devoted to the details of a new framework for acoustic modeling using maximum entropy direct models, including practical issues of implementation and usage. Traditional statistical models for speech recognition have all been based on a Bayesian framework using generative models such as hidden Markov models (HMMs). The new framework is based on maximum entropy direct modeling, where the probability of a state or word sequence given an observation sequence is computed directly from the model. In contrast to HMMs, features can be asynchronous and overlapping, and need not be statistically independent; the model therefore allows the potential combination of many different types of features. Results from a specific kind of direct model, the maximum entropy Markov model (MEMM), are presented. Even with conventional acoustic features, the approach already shows promising results for phone-level decoding. The MEMM significantly outperforms traditional HMMs in word error rate when used as a stand-alone acoustic model, and combining MEMM scores with HMM and language model scores shows modest improvements over the best HMM speech recognizer. We give a sense of some exciting possibilities for future research in using maximum entropy models for acoustic modeling.
Citations: 3
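The direct-modeling idea above, computing p(class | observation) from weighted feature functions, can be illustrated with a tiny conditional maxent classifier trained by gradient ascent. This is a generic sketch of the principle, not the tutorial's MEMM implementation:

```python
import math

def train_maxent(samples, num_feats, classes, epochs=200, lr=0.5):
    """Conditional maxent model p(c|x) proportional to exp(w[c] . f(x)),
    trained by stochastic gradient ascent on the log-likelihood.

    samples: list of (feature_vector, class_label) pairs.
    """
    w = {c: [0.0] * num_feats for c in classes}
    for _ in range(epochs):
        for feats, label in samples:
            scores = {c: math.exp(sum(wi * fi for wi, fi in zip(w[c], feats)))
                      for c in classes}
            z = sum(scores.values())
            for c in classes:
                p = scores[c] / z                    # model posterior
                target = 1.0 if c == label else 0.0  # empirical expectation
                for i, fi in enumerate(feats):
                    w[c][i] += lr * (target - p) * fi
    return w

def classify(w, feats):
    """Pick the class with the highest linear score (softmax argmax)."""
    return max(w, key=lambda c: sum(wi * fi for wi, fi in zip(w[c], feats)))
```

Because each feature simply contributes a weight to the score, features may overlap and be statistically dependent, which is exactly the flexibility the abstract contrasts with HMM assumptions.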
Perception of Mandarin intonation
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409582
Jiahong Yuan
Abstract: This study investigates how tone and intonation, and how focus and intonation, interact in intonation-type (statement versus question) identification. A perception experiment was conducted on a speech corpus of 1040 utterances, with sixteen listeners participating. The results reveal three asymmetries: in statement versus question intonation identification; in the effects of sentence-final Tone2 and Tone4 on question intonation identification; and in the effects of final focus on statement and question intonation identification. These asymmetries suggest that: (1) statement intonation is a default or unmarked intonation type whereas question intonation is marked; (2) question intonation has higher prosodic strength in sentence-final position; and (3) there is a tone-dependent mechanism of question intonation in sentence-final position.
Citations: 25
Predicting prosodic words from lexical words - a first step towards predicting prosody from text
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409614
Hua-Jui Peng, Chi-ching Chen, Chiu-yu Tseng, Keh-Jiann Chen
Abstract: Much remains unsolved in predicting prosody from text for unlimited Mandarin Chinese TTS; the interactions and rules relating syntactic structure to prosodic structure are still open challenges. Using part-of-speech (POS) tagging, which requires lexical information from the text, we aim to find significant word-grouping patterns by analyzing real speech data together with that lexical information. The paper reports discrepancies found between lexical words (LWs) parsed from text and prosodic words (PWs) annotated from speech data, and proposes a statistical model to predict PWs from LWs. In the model, word length and POS tags are the two essential features for predicting PWs, and the results show approximately 90% prediction accuracy for PWs, though room for improvement remains. We believe that evidence from PW prediction is a first step towards building prosody models from text.
Citations: 18
Apply length distribution model to intonational phrase prediction
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409624
Jianfeng Li, Guoping Hu, Ming Fan, Lirong Dai
Abstract: A length distribution model for intonational phrase prediction is proposed. The model gives the probability that a sentence of a given length is divided into intonational phrases of given lengths. We discuss how to estimate the model's probabilities from a training corpus and how to apply it to intonational phrase prediction, and we combine it with a maximum entropy model that exploits local context information. Experimental results show that length distribution is valuable information for intonational phrase prediction and that it makes a significant extra contribution over the maximum entropy model in terms of average score and unacceptable rate.
Citations: 1
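One plausible reading of estimating such a model from a corpus, sketched under the assumptions of add-one smoothing and independent phrase lengths (the abstract does not give the paper's exact estimator):

```python
from collections import Counter

def estimate_length_dist(corpus):
    """Estimate P(phrase length) from a segmented corpus.

    corpus: list of sentences, each given as its intonational-phrase lengths.
    Add-one smoothing; unseen lengths fall back to the floor probability.
    """
    counts = Counter(length for sent in corpus for length in sent)
    total = sum(counts.values())
    max_len = max(counts)
    return lambda l: (counts[l] + 1) / (total + max_len)

def segmentation_score(p, phrase_lengths):
    """Score a candidate segmentation as the product of its phrase-length
    probabilities (phrase lengths assumed independent)."""
    score = 1.0
    for length in phrase_lengths:
        score *= p(length)
    return score
```

Comparing candidate break placements by this score is how a length model can re-rank the hypotheses of a local-context (e.g. maxent) predictor.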
Cantonese verbal information verification system using GMM-based anti-model
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409645
Chao Qin, Tan Lee
Abstract: Verbal information verification (VIV) is one of the approaches to speaker authentication: a process in which the spoken utterance of a claimed speaker is verified against the key information in the speaker's registered profile. VIV in English has been extensively studied, and there has also been some work on Mandarin VIV. In this paper, we study VIV for users who speak Cantonese, the most commonly used dialect in Southern China and Hong Kong. We propose a new technique for anti-modeling that uses context-independent Gaussian mixture models (GMMs) instead of the conventional hidden Markov models (HMMs). Experiments on 50 Cantonese native speakers show that the proposed method separates the verification scores of claimant utterances from those of imposter utterances better than the HMM-based method. An equal error rate of 0.00% is attained with a robust interval of up to 15%, demonstrating excellent performance.
Citations: 2
An embedded English synthesis approach based on speech concatenation and smoothing
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409610
Guilin Chen, Dongjian Yue, Yiqing Zu, Zhenli Yu
Abstract: An embedded English synthesis approach based on speech concatenation and smoothing is described. The approach adopts phonetic sub-words as carriers of variable-length units, and defines five classes of units to cover all English phonetic phenomena. The corresponding cost function and a dynamic-programming search procedure are addressed in the unit-selection stage. In the synthesis stage, vocal tract response, pitch value and phase are interpolated and merged at concatenation points to smooth the speech. Preliminary tests show that the approach achieves a good balance of naturalness, intelligibility and data footprint.
Citations: 6
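Merging units at concatenation points can be illustrated with a simple linear crossfade over an overlap window; the paper interpolates vocal tract response, pitch and phase, which this waveform-level sketch only approximates:

```python
def smooth_join(left, right, overlap):
    """Concatenate two sample sequences with a linear crossfade over
    `overlap` samples at the join."""
    assert 0 < overlap <= min(len(left), len(right))
    head = left[:-overlap]           # untouched part of the left unit
    tail = right[overlap:]           # untouched part of the right unit
    blend = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # fade-in weight for the right unit
        blend.append((1 - w) * left[len(left) - overlap + i] + w * right[i])
    return head + blend + tail
```

The same interpolation scheme applies per parameter (pitch contour, spectral envelope) rather than per sample in a parametric smoother.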
Prosody and style controls in CU VOCAL using SSML and SAPI XML tags
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409623
T. Fung, Yuk-Chi Li, H. Meng, P. Ching
Abstract: CU VOCAL is a Cantonese text-to-speech (TTS) engine that uses a syllable-based concatenative synthesis approach to generate intelligible and natural synthetic speech in Cantonese. The paper reports on recent enhancements to CU VOCAL that support user adjustment of prosody and style via the Speech Synthesis Markup Language (SSML) in the input text. CU VOCAL was previously developed as a SAPI-compliant engine to enable easy integration with other applications; the paper also reports on enhancements to the CU VOCAL SAPI (speech API) engine to support SAPI 5 XML tags.
Citations: 1
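SSML expresses prosody and style controls with standard tags such as `<prosody>` and `<break>`. A sketch of generating such markup for a TTS engine's input (the attribute values are arbitrary examples, not CU VOCAL's documented supported set):

```python
def ssml_utterance(text, rate="slow", pitch="+10%", break_ms=300):
    """Wrap text in standard SSML 1.0 prosody controls, followed by a pause."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">'
        f'<prosody rate="{rate}" pitch="{pitch}">{text}</prosody>'
        f'<break time="{break_ms}ms"/>'
        '</speak>'
    )
```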
A method of estimating the equal error rate for automatic speaker verification
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409642
Jyh-Min Cheng, Hsiao-Chuan Wang
Abstract: In an automatic speaker verification (ASV) system, the equal error rate (EER) is a measure used to evaluate system performance, and it usually requires a large number of testing samples to calculate. To estimate the EER without running experiments on testing samples, a model-based EER estimation method is proposed that computes likelihood scores directly from client speaker models and imposter models. However, the distribution of the scores so computed is significantly biased relative to the distribution of likelihood scores obtained from testing samples. We therefore propose a novel idea: manipulate the speaker models of the clients and imposters so that the distribution of the computed likelihood scores is closer to that obtained from testing samples, after which a more reliable EER can be calculated from the speaker models. The experimental results show that the proposed method estimates the EER properly.
Citations: 45
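The EER itself is defined at the operating point where the false-accept and false-reject rates coincide. A standard way to compute it from genuine and imposter score lists, which is the sample-based baseline the paper's model-based estimator replaces:

```python
def equal_error_rate(genuine, impostor):
    """Sweep a decision threshold over all observed scores and return
    (EER, threshold) at the point where the false-accept rate (FAR) and
    false-reject rate (FRR) are closest; EER is their average there."""
    best_gap, eer, best_t = 2.0, 1.0, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # accepted imposters
        frr = sum(s < t for s in genuine) / len(genuine)     # rejected clients
        if abs(far - frr) < best_gap:
            best_gap, eer, best_t = abs(far - frr), (far + frr) / 2, t
    return eer, best_t
```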
Adaptive conditional pronunciation modeling using articulatory features for speaker verification
2004 International Symposium on Chinese Spoken Language Processing | Pub Date: 2004-12-15 | DOI: 10.1109/CHINSL.2004.1409586
Ka-Yee Leung, M. Mak, M. Siu, S. Kung
Abstract: This paper proposes an articulatory feature-based conditional pronunciation modeling (AFCPM) technique for speaker verification. The technique models speakers' pronunciation behavior by linking the actual phones produced by a speaker to the state of articulation during speech production. Speaker models, consisting of conditional probabilities of two articulatory classes, are adapted from a set of universal background models (UBMs) using MAP adaptation. This adaptation approach aims to prevent over-fitting the speaker models when the amount of speaker data is insufficient for direct estimation. Experimental results show that the adaptation technique enhances the discriminating power of speaker models by establishing a tighter coupling between the speaker models and the UBM. Results also show that fusing the scores of an AFCPM-based system and a conventional spectral-based system achieves a significantly lower error rate than either system alone, suggesting that AFCPM and spectral features are complementary.
Citations: 0
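MAP adaptation, as used above, interpolates between speaker statistics and the UBM prior with a relevance factor. A scalar sketch with hard component assignments (the paper, like most GMM systems, would use soft posteriors):

```python
def map_adapt_means(ubm_means, data, assign, r=16.0):
    """MAP-adapt (scalar) UBM component means toward speaker data.

    assign gives a hard component index per data point, a simplification of
    the usual soft posterior weighting; r is the relevance factor.
    """
    adapted = []
    for k, m in enumerate(ubm_means):
        pts = [x for x, a in zip(data, assign) if a == k]
        if not pts:
            adapted.append(m)       # unseen components keep the UBM prior
            continue
        n = len(pts)
        xbar = sum(pts) / n         # speaker sufficient statistic
        alpha = n / (n + r)         # data-dependent adaptation weight
        adapted.append(alpha * xbar + (1 - alpha) * m)
    return adapted
```

With little data, alpha stays small and the model stays near the UBM, which is exactly the over-fitting protection the abstract describes.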