2008 6th International Symposium on Chinese Spoken Language Processing: Latest Papers

Discriminative Feedback Adaptation for GMM-UBM Speaker Verification
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.54
Yi-Hsiang Chao, Wei-Ho Tsai, H. Wang
{"title":"Discriminative Feedback Adaptation for GMM-UBM Speaker Verification","authors":"Yi-Hsiang Chao, Wei-Ho Tsai, H. Wang","doi":"10.1109/CHINSL.2008.ECP.54","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.54","url":null,"abstract":"The GMM-UBM system is the current state-of-the-art approach for text-independent speaker verification. The advantage of the approach is that both the target speaker model and the impostor model (UBM) have generalization ability to handle \"unseen\" acoustic patterns. However, since GMM-UBM uses a common anti-model, namely the UBM, for all target speakers, it tends to be weak in rejecting impostors' voices that are similar to the target speaker's voice. To overcome this limitation, we propose a discriminative feedback adaptation (DFA) framework that reinforces the discriminability between the target speaker model and the anti-model while preserving the generalization ability of the GMM-UBM approach. This is done by adapting the UBM to a target-speaker-dependent anti-model based on a minimum verification squared-error criterion, rather than estimating it from scratch by applying the conventional discriminative training schemes. The results of experiments conducted on the NIST2001-SRE database show that DFA substantially improves the performance of the conventional GMM-UBM approach.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"6 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121005703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Speech Database Compacted for an Embedded Mandarin TTS System
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.74
Qing Guo, Bin Wang, N. Katae
{"title":"Speech Database Compacted for an Embedded Mandarin TTS System","authors":"Qing Guo, Bin Wang, N. Katae","doi":"10.1109/CHINSL.2008.ECP.74","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.74","url":null,"abstract":"In recent years, unit-selection-based concatenative speech synthesis systems that use a large speech database have become popular because they can produce high-quality synthesized speech. However, using such a large speech database is not practical for many applications, such as those ported to embedded devices, because of the storage requirement and the computational complexity involved in searching it. This paper proposes a context-based pruning algorithm and a waveform-adjustment-effect-based pruning algorithm to compact the speech database. Finally, it presents experimental results and discussion.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"152 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125242506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
How Syllables Group in Chinese
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.88
Maolin Wang, Yi Xu
{"title":"How Syllables Group in Chinese","authors":"Maolin Wang, Yi Xu","doi":"10.1109/CHINSL.2008.ECP.88","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.88","url":null,"abstract":"In connected speech, syllables are believed to be prosodically organized into groups. Such groups are thought of as either the basic units of speech rhythm or a prosodic hierarchy. In this study we investigated the nature of syllable organization by examining syllable duration, tonal undershoot and F0 height in Chinese as related to speaking mode, group position in sentence, syllable position in group, and number of syllables in the group. The results showed polysyllabic shortening and constituent-edge lengthening previously reported for languages with lexical stress, despite the fact that no lexical stress was involved in the present study.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129938452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Deriving MFCC Parameters from the Dynamic Spectrum for Robust Speech Recognition
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.33
Nengheng Zheng, Xia Li, Houwei Cao, Tan Lee, P. Ching
{"title":"Deriving MFCC Parameters from the Dynamic Spectrum for Robust Speech Recognition","authors":"Nengheng Zheng, Xia Li, Houwei Cao, Tan Lee, P. Ching","doi":"10.1109/CHINSL.2008.ECP.33","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.33","url":null,"abstract":"State-of-the-art automatic speech recognition systems typically adopt a feature set containing mel-frequency cepstral coefficients (MFCC) and their time derivatives. The noise vulnerability of MFCC significantly degrades the recognition performance of such systems in noisy conditions. This paper describes a noise-robust feature extraction method. A set of new MFCC features is derived from the dynamic spectrum instead of the static spectrum used in conventional MFCC feature extraction. It is shown that the dynamic spectrum preserves the spectral envelope information and, at the same time, is more noise resistant than the static spectrum. Experiments on the Aurora 2 database demonstrate the noise robustness of the proposed features and indicate that it is preferable to replace MFCC with the new features in the state-of-the-art feature set.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130817732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Mandarin Speech Recognition for Nonnative Speakers Based on Pronunciation Dictionary Adaptation
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.66
Jian Yang, P. Wu, Dan Xu
{"title":"Mandarin Speech Recognition for Nonnative Speakers Based on Pronunciation Dictionary Adaptation","authors":"Jian Yang, P. Wu, Dan Xu","doi":"10.1109/CHINSL.2008.ECP.66","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.66","url":null,"abstract":"Various techniques, such as acoustic model adaptation and pronunciation adaptation, have been reported to improve the recognition of nonnative or accented speech. In this paper, we analyze the regular pairs of pronunciation variation in the nonnative Mandarin speech spoken by Dai, Lisu and Naxi speakers from Yunnan. According to the typical pronunciation variations of these three accents, more than one pronunciation for some words (i.e. tonal syllables or characters) has been inserted into the standard Mandarin pronunciation dictionary. The experiments show that an improvement is achieved with the new dictionary and a simple 2-gram language model for all kinds of nonnative speakers.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133811175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Simplified Deformation Compensation for Emotional Speaker Recognition
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.89
Yingchun Yang, Tian Wu, Hongbing Lv
{"title":"Simplified Deformation Compensation for Emotional Speaker Recognition","authors":"Yingchun Yang, Tian Wu, Hongbing Lv","doi":"10.1109/CHINSL.2008.ECP.89","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.89","url":null,"abstract":"Emotional speaker recognition has been investigated by a number of researchers. However, all the current approaches require a large amount of emotional speech from speakers during training, and even the emotional state of a user during testing, which hinders the commercialization of speaker recognition technology. We propose a method based on a novel view of the MFCC deformation caused by pitch deviation, named pitch deviation-based cepstrum compensation (PDCC), which takes into account the correlation between the glottis and the vocal tract. Our method is applied to two emotional speech corpora, EPS and MASC, with absolute IR (identification rate) increases of 10.1% for the former and 4.12% for the latter, which are promising results.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130186731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Eigenchannel Compensation and Symmetric Score for Robust Text-Independent Speaker Verification
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.92
Yuan Dong, Jian Zhao, Liang Lu, Jiqing Liu, Xianyu Zhao, Haila Wang
{"title":"Eigenchannel Compensation and Symmetric Score for Robust Text-Independent Speaker Verification","authors":"Yuan Dong, Jian Zhao, Liang Lu, Jiqing Liu, Xianyu Zhao, Haila Wang","doi":"10.1109/CHINSL.2008.ECP.92","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.92","url":null,"abstract":"The negative effect of session variability has become more and more severe for the performance of speaker verification systems. This paper discusses eigenchannel compensation and investigates a symmetric scoring method to diminish the session variability and further enhance the performance. Experiments were conducted on the core tests of the 2006 and 2008 speaker recognition evaluation (SRE) corpora of the National Institute of Standards and Technology (NIST). The experimental results demonstrate that eigenchannel compensation achieves a substantial improvement and that symmetric scoring, as a measurement of cross similarity, further improves the performance moderately. Overall, the system performance is significantly improved: the equal error rate falls from 9.74% to 5.08% (a 47.8% relative reduction) on the SRE06 corpus and from 16.26% to 9.42% (42.1%) on the SRE08 corpus, while the detection cost function falls from 0.0456 to 0.0263 (42.3%) on the SRE06 corpus and from 0.0692 to 0.0449 (35.1%) on the SRE08 corpus.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114073942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
An HMM Compensation Approach for Dynamic Features Using Unscented Transformation and its Application to Noisy Speech Recognition
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.39
Yu Hu, Qiang Huo
{"title":"An HMM Compensation Approach for Dynamic Features Using Unscented Transformation and its Application to Noisy Speech Recognition","authors":"Yu Hu, Qiang Huo","doi":"10.1109/CHINSL.2008.ECP.39","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.39","url":null,"abstract":"In our previous work, a new HMM compensation approach for static MFCC features was proposed by using a technique called Unscented Transformation (UT). Three implementations of the UT approach with different computational complexities were evaluated on the Aurora2 connected digits database, and significant performance improvements were achieved compared to log-normal-approximation-based PMC (Parallel Model Combination) and first-order-approximation-based VTS (Vector Taylor Series) approaches. In this paper, we extend our UT-based formulation to compensate HMM parameters corresponding to both static and dynamic features. New experimental results on the Aurora2 task are reported to demonstrate the effectiveness of the proposed UT approach.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116134882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Language Model Adaptation for Relevance Feedback in Information Retrieval
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.84
Ying-Lang Chang, Jen-Tzung Chien
{"title":"Language Model Adaptation for Relevance Feedback in Information Retrieval","authors":"Ying-Lang Chang, Jen-Tzung Chien","doi":"10.1109/CHINSL.2008.ECP.84","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.84","url":null,"abstract":"Language modeling is a popular method of exploiting linguistic regularities for document retrieval. To improve retrieval performance, a relevance feedback scheme is adopted that adjusts the query language model using the information fed back from the retrieved documents. This study presents a new Bayesian learning approach to instantaneous and unsupervised adaptation of the language model for adaptive information retrieval. We aim to compensate for the domain mismatch between query and documents by adapting the query language model to meet the domains of the collected documents. The maximum a posteriori adaptation is executed solely by using the input query, without additional collection of adaptation data. The retrieved top N documents are utilized as relevant documents and used as feedback to estimate a mixture of language models for Bayesian document retrieval. Experiments on TREC datasets show that the proposed method significantly outperforms the other relevance feedback methods.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125199617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
What's in the F0 of Mandarin Speech: Tones, Intonation and Beyond
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.23
Chiu-yu Tseng, Zhao-yu Su
{"title":"What's in the F0 of Mandarin Speech: Tones, Intonation and Beyond","authors":"Chiu-yu Tseng, Zhao-yu Su","doi":"10.1109/CHINSL.2008.ECP.23","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.23","url":null,"abstract":"We analyzed F0 contours of fluent Mandarin speech using a modified command-response model. Adopting the multiple-phrase speech paragraph as a discourse prosodic unit, we investigated the composition of F0 contours to see whether additional prosodic information beyond tones and intonation exists. Testing F0 contributions with a previously constructed prosody hierarchy, the HPG (hierarchy of prosodic phrase grouping), results showed that tone identities only make up 40-45% of output F0 while other higher layers of information contribute the rest. Final F0 output is cumulative of all layers combined. The results thus provide an account of why prosodic context consists of both adjacent and cross-over associations and how global prosodic context is reflected in the formation of output F0. We believe these results shed new light on speech technology development.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129676372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10