2008 6th International Symposium on Chinese Spoken Language Processing: Latest Papers

Pronunciation Space Models for Pronunciation Evaluation
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.17
Si Wei, Yi-Qian Pan, Guoping Hu, Yu Hu, Ren-Hua Wang
{"title":"Pronunciation Space Models for Pronunciation Evaluation","authors":"Si Wei, Yi-Qian Pan, Guoping Hu, Yu Hu, Ren-Hua Wang","doi":"10.1109/CHINSL.2008.ECP.17","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.17","url":null,"abstract":"Posterior probability is mostly used for pronunciation evaluation. This paper introduces pronunciation space models to calculate posterior probability replacing traditional phone-based acoustic models, which makes the calculated posterior probability more precise. Pronunciation space models are constructed using unsupervised clustering method guided by human scores and phone-level posterior probability. By using correlation between machine scores and human scores as the performance measurement, pronunciation space models based method shows its effectiveness for pronunciation evaluation in the experiments on a Chinese database spoken by Koreans with the correlation's improvement from 0.390 to 0.415 comparing to the traditional method based on phone based acoustic models.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"421 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115856216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Word Alignment Based on Multi-Grain Model
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.79
Yanqing He, Yu Zhou, Chengqing Zong
{"title":"Word Alignment Based on Multi-Grain Model","authors":"Yanqing He, Yu Zhou, Chengqing Zong","doi":"10.1109/CHINSL.2008.ECP.79","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.79","url":null,"abstract":"Word alignment plays a critical role in statistical machine translation (SMT) and cross-language information retrieval. Until now, most existing methods get the word alignment within the whole range of the sentence length. The alignment quality is unsatisfactory. In this paper, we propose a novel approach to word alignment based on multi-grain model (WAMG). We split a parallel sentence pair into blocks in different grain and get the word alignments within each corresponding block. Our approach is able to restrict the search space of word alignment in the relatively accurate local range and reduce the mapping error. The experiments have shown that our approach outperforms the traditional word alignment algorithm relatively by about 12% in AER and improves the performance of Chinese-to-English translation system relatively by about 2.8% in BLEU.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"29 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117044616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Frequency Modulation Technique for Prosodic Modification
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.41
Jinfu Ni, S. Sakai, Tohru Shimizu, Satoshi Nakamura
{"title":"Frequency Modulation Technique for Prosodic Modification","authors":"Jinfu Ni, S. Sakai, Tohru Shimizu, Satoshi Nakamura","doi":"10.1109/CHINSL.2008.ECP.41","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.41","url":null,"abstract":"Modulation of speaking tone in frequency can make speech interesting and convey subtle meaning in communication. We present a frequency modulation (FM) technique for prosodic modification to consider communicative speech synthesis. This technique provides a mathematical formulation for representing speaking tone and manipulating FM in a unified framework. Two experiments are conducted with a text-to-speech system to which a module of FM-based prosodic modification is added. One is to enhance emphasis in words when synthesizing Chinese conversational speech. The other is to modify reading-style prosody while conveying good and bad news in Japanese; this is done by using the FM technique to shift the frequency ranges and rescale the fundamental frequency contours jointly. The experimental results indicated that the native speakers identified 90% of samples with emphases and 78% of \"good news\" as well as 94% of \"bad news\" samples. The FM technique is vital for making synthetic speech communicative.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115128441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
PLSA Based Topic Mixture Language Modeling Approach
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.58
Shuanhu Bai, Haizhou Li
{"title":"PLSA Based Topic Mixture Language Modeling Approach","authors":"Shuanhu Bai, Haizhou Li","doi":"10.1109/CHINSL.2008.ECP.58","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.58","url":null,"abstract":"In this paper, we propose a method to extend the use of latent topics into higher order n-gram models. In training, the parameters of higher order n-gram models are estimated using discounted average counts derived from the application of probabilistic latent semantic analysis(PLSA) models on n-gram counts in training corpus. In decoding, a simple yet efficient topic prediction method is introduced to predict its topic given a new document. The proposed topic mixture language model (TMLM) displays two advantages over previous methods: 1) having the ability of building topic mixture n-gram LM (n>1) and, 2) without requiring a big general baseline LM. The experimental results show that TMLMs, even using smaller number of topics, outperform LMs implemented using both standard n-gram approach and unsupervised adaptation approaches in terms of perplexity reductions.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"60 1-2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126953297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Use of Dynamic Deformable Templates for Lip Tracking in an Audio-Visual Corpus with Large Variations in Head Pose, Face Illumination and Lip Shapes
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.104
Zhiyong Wu, Jiying Wu, H. Meng
{"title":"The Use of Dynamic Deformable Templates for Lip Tracking in an Audio-Visual Corpus with Large Variations in Head Pose, Face Illumination and Lip Shapes","authors":"Zhiyong Wu, Jiying Wu, H. Meng","doi":"10.1109/CHINSL.2008.ECP.104","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.104","url":null,"abstract":"This paper describes an approach for lip tracking using dynamic deformable templates. The objective is to track lip parameters from an audio-visual corpus recording a voice talent who is reading text prompts in a natural and expressive way. The corpus presents challenges to the conventional method of lip tracking with deformable templates. This is because natural and expressive speech includes relatively large motions of the head and the lips. The head motions lead to changes in the illumination of the face region and changes in the observed lip shape. In addition, emphatic pronunciations lead to large changes in the lip shape. Video frames that are affected by face illumination changes present additional difficulty in locating the mouth region (i.e. region of interest, ROI). Video frames that are affected by changes in lip shapes present additional deviations from the lip templates and hence lower tracking accuracies. Our proposed method incorporates \"dynamicity\" in the deformable templates to render them adaptive to changes in head pose, face illumination and lip shapes. Experiments show that dynamic deformable templates consistently outperform the conventional deformable templates in lip tracking.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129156190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A Combined Task Analysis Method for Data Selection in Mandarin Isolated Word Recognition System
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.65
Z. He, Z. Wang, W. Li, J. Wu
{"title":"A Combined Task Analysis Method for Data Selection in Mandarin Isolated Word Recognition System","authors":"Z. He, Z. Wang, W. Li, J. Wu","doi":"10.1109/CHINSL.2008.ECP.65","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.65","url":null,"abstract":"This paper studies the performance of the data selection with a combined task analysis method in task adaptation on Mandarin isolated word recognition. The proposed task analysis method combines coverage unit balanced task analysis with the confusability based analysis. The performance is evaluated with several experiments.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128103173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pitch Tracking for Model-Based Speech Separation
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.48
Siu Wa Lee, F. Soong, P. Ching, Tan Lee
{"title":"Pitch Tracking for Model-Based Speech Separation","authors":"Siu Wa Lee, F. Soong, P. Ching, Tan Lee","doi":"10.1109/CHINSL.2008.ECP.48","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.48","url":null,"abstract":"Estimating multiple pitch frequencies of concurrent speech sources from a single-microphone input is essential to speech separation. Nevertheless, pitch cues of individual sources are weakened by each other, making the estimation unreliable. This paper presents a pitch tracking method that incorporated in a model-based separation framework. Multiple pitch estimation is simplified into single pitch estimation by segregating the source envelope from mixture spectrum with statistics of familiar speech patterns. Comprehensive experiments have compared the proposed tracking method with a recently reported multiple pitch estimator and its modified version equipped with ideal pitch cues. Lower estimation errors are achieved. Furthermore, this approach is applicable to other model-based frameworks as well.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134580085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Exploring Tone Variations in Chinese Dialects Using Context Dependent Tone Models
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.106
Wei Guo, Min Chu
{"title":"Exploring Tone Variations in Chinese Dialects Using Context Dependent Tone Models","authors":"Wei Guo, Min Chu","doi":"10.1109/CHINSL.2008.ECP.106","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.106","url":null,"abstract":"In this paper we propose a statistical approach for prosody study. It has two key stages: first, context dependent models are trained automatically from a natural speech corpus. Then data mining and data visualization techniques are used to discover phonetic knowledge from the model parameters. We use this approach to study the tonal system of Xi'an dialect, where we learn much knowledge about the dialect, without the need of any specific phonetic annotations about it. Some of our observations coincide with those from other studies, which demonstrate the capability of this approach, while some of our new findings show its advantages comparing with traditional methods.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116586142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Pronunciation Error Detection for Computer Assisted Pronunciation Teaching in Mandarin
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.98
Min-Siong Liang, Jian-Yung Hung, Ren-Yuan Lyu, Yuang-Chin Chiang
{"title":"Pronunciation Error Detection for Computer Assisted Pronunciation Teaching in Mandarin","authors":"Min-Siong Liang, Jian-Yung Hung, Ren-Yuan Lyu, Yuang-Chin Chiang","doi":"10.1109/CHINSL.2008.ECP.98","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.98","url":null,"abstract":"In this paper, we provided a strategy of error detection of pronunciation and applied it to the computer-assisted pronunciation teaching(CAPT), especially in Mandarin language learning. In our system, it can be divided into two parts: the sentence verification(SV) and syllable identification(SI). First was used to ban out-of-task sentences. We used the likelihood ratio test, which was computed between the maximum probability of a result under two different hypotheses, i.e. null hypothesis and alternative hypothesis models, to verify the deviation degree and decide whether the student pronunciation is out-of-task. In SV part, the experimental results was significant and had 91.0% rate of F-score. The second part was applied to recognize the content of speech read by the speaker. The recognition net was built as a sausage shape with pronunciation confusion table corresponding to confusion error patterns. Then, the system could find out the wrong pronounced syllable for the appropriate feedback to correct the pronunciation of the users. In the stage of SI, the best detection rate had a F-score rate of 77.2%.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131133594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Pitch Synchronous Method for Speech Modification
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/chinsl.2008.ecp.73
Chih-Ting Kuo, Hsiao-Chuan Wang
{"title":"A Pitch Synchronous Method for Speech Modification","authors":"Chih-Ting Kuo, Hsiao-Chuan Wang","doi":"10.1109/chinsl.2008.ecp.73","DOIUrl":"https://doi.org/10.1109/chinsl.2008.ecp.73","url":null,"abstract":"The speech modification is a mechanism of changing speech characteristics and prosody for some specific applications. It is used in voice conversion, pronunciation correction, tone perception, and language learning. The most important part is the change of pitch in an utterance. Pitch extraction is an essential process for speech modification. This paper presents an efficient pitch extraction algorithm based on the normalized second standard deviation function (NSSDF) of magnitude difference. A pitch synchronous method for modifying speaking rate and pitch trajectory is proposed. The speaking rate is modified by inserting or deleting pitch periods in voiced segments. The pitch trajectory change is accomplished by modifying the pitch period of residual signal obtained from pitch synchronous linear prediction (LP) analysis and reconstructing speech signal by LP filter. A speech modification system is developed for Mandarin perception which is used to help hearing impaired students in pronunciation learning.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134069467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1