2008 6th International Symposium on Chinese Spoken Language Processing: Latest Papers

A Two-Stage Multi-Feature Integration Approach to Unsupervised Speaker Change Detection in Real-Time News Broadcasting
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.99
Lei Xie, Guangsen Wang
Abstract: This paper presents a two-stage multi-feature integration approach for unsupervised speaker change detection in real-time news broadcasting. We integrate MFCC and LSP features (i.e., a perceptual feature plus an articulatory feature) in the metric-based potential speaker change detection stage to collect as many speaker boundary candidates as possible. We adopt a weighted Bayesian information criterion (BIC) to integrate boundary decisions from MFCC and LSP features in the speaker boundary confirmation stage. This multi-feature integration strategy exploits the complementarity between perceptual and articulatory features to achieve a performance gain. Speaker change detection experiments show that the multi-feature integration approach significantly outperforms the individual features, with relative improvements of 26% over the LSP-only approach and 6% over the MFCC-only approach.
Citations: 6
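The abstract above names a BIC-based boundary confirmation stage without giving formulas. As a rough illustration only, the following sketch computes the standard Delta-BIC score for a candidate boundary on a one-dimensional feature stream. The single-Gaussian 1-D model, the penalty weight `lam`, and the `fused_delta_bic` combination with weight `w` are all assumptions made for this example; the paper's actual weighted BIC may differ.

```python
import math
from statistics import pvariance


def delta_bic(left, right, lam=1.0):
    """Delta-BIC for splitting one 1-D Gaussian segment into two.

    Positive values favour a speaker change between `left` and `right`.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    # Log-likelihood gain from modelling the two halves separately.
    gain = 0.5 * (
        n * math.log(pvariance(left + right))
        - n1 * math.log(pvariance(left))
        - n2 * math.log(pvariance(right))
    )
    # Penalty for the extra mean and variance parameters (2 in 1-D).
    penalty = 0.5 * lam * 2 * math.log(n)
    return gain - penalty


def fused_delta_bic(d_mfcc, d_lsp, w=0.5):
    """Hypothetical fusion: convex combination of per-feature scores."""
    return w * d_mfcc + (1.0 - w) * d_lsp


# Toy data (not from the paper): identical halves vs. a shifted half.
same = [0.0, 1.0, 2.0, 3.0] * 10
shifted = [x + 10.0 for x in same]
print(delta_bic(same, same) < 0)     # identical halves: no change supported
print(delta_bic(same, shifted) > 0)  # shifted mean: change supported
```

In practice the features would be multi-dimensional MFCC or LSP vectors with full covariance matrices; the scalar version is used here only to keep the example dependency-free.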
Word Order Correction for Language Transfer Using Relative Position Language Modeling
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.20
Chao-Hong Liu, Chung-Hsien Wu, Matthew Harris
Abstract: Sentence correction has been an important and emerging issue in computer-assisted language learning. However, existing techniques based on grammar rules or statistical machine translation are still not robust enough to tackle the word order errors common in sentences produced by second language learners of Chinese. In this paper, a novel relative position language model is proposed to address this problem, for which a corpus of erroneous English-Chinese language transfer sentences, along with their corrected counterparts, is created and manually judged by human annotators. Experimental results show that, compared to a scoring approach based on an n-gram language model and a phrase-based machine translation system, the proposed approach improves BLEU scores by 20.3% and 26.5%, respectively, for the correction of word order errors resulting from language transfer.
Citations: 7
A Sample and Feature Selection Scheme for GMM-SVM Based Language Recognition
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.93
Yan Song, Lirong Dai
Abstract: Discriminative training has been a key tool for improving the performance of language recognition systems. SVM-based algorithms (e.g., GMM-SVM, GLDS-SVM) are an important class of such methods. The core of these algorithms is the construction of a kernel for comparing the similarity of two sequences. It is known that a mismatch between training and test conditions degrades performance. In this paper, we propose a novel sample and feature selection scheme under the GMM-SVM framework, which aims to alleviate the duration mismatch problem. The proposed method is evaluated on the NIST 03 and 07 language recognition evaluation tasks, with improvement over prior techniques.
Citations: 0
A New Similarity Measure Between HMMs
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.67
Yih-Ru Wang
Abstract: In this paper, a new similarity measure between HMMs, which extends the well-known Kullback-Leibler (KL) distance, is proposed. The KL distance is defined as the mean of the log-likelihood ratio (LLR) in a hypothesis test and is frequently used as a similarity measure for HMMs. Here, the standard deviation of the LLR between HMMs is derived first. The ratio of the mean to the standard deviation of the LLR is then used as a new similarity measure between HMMs. Experiments were carried out on a Mandarin speech database, TCC-300, to check the effectiveness of the proposed similarity measure. The accuracy of the standard deviation of the LLR estimated from the syllable HMMs was checked by comparison with the standard deviation of the LLR of the top-10 candidates found by the HMM decoder. The confusion sets of 411 syllables were also found using both the KL distance and the proposed similarity measure. Compared to the top-10 confusion models, inclusion rates of 94.9% and 95.3% are achieved using the KL distance and the proposed similarity measure, respectively.
Citations: 1
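To make the quantities in the abstract above concrete, here is a minimal sketch that estimates the mean of the LLR (the KL distance), its standard deviation, and their ratio. It deliberately swaps in two single 1-D Gaussians for the HMMs and uses Monte Carlo sampling instead of the paper's derivation; the function names, sample size, and seed are assumptions for illustration only.

```python
import math
import random
from statistics import mean, pstdev


def gauss_logpdf(x, mu, sigma):
    """Log-density of a 1-D Gaussian N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def llr_stats(mu_p, sigma_p, mu_q, sigma_q, n=20000, seed=0):
    """Sample x ~ p and return (mean LLR, std LLR, mean/std ratio).

    The mean LLR is a Monte Carlo estimate of KL(p || q).
    """
    rng = random.Random(seed)
    llrs = []
    for _ in range(n):
        x = rng.gauss(mu_p, sigma_p)
        llrs.append(gauss_logpdf(x, mu_p, sigma_p) - gauss_logpdf(x, mu_q, sigma_q))
    m, s = mean(llrs), pstdev(llrs)
    return m, s, m / s


kl, llr_std, ratio = llr_stats(0.0, 1.0, 1.0, 1.0)
print(f"mean LLR (KL) = {kl:.3f}, std LLR = {llr_std:.3f}, ratio = {ratio:.3f}")
```

For these two unit-variance Gaussians the LLR reduces to 0.5 - x, so its exact mean is 0.5 and its exact standard deviation is 1; the sampled values should land close to those figures.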
Efficient System Combination for Syllable-Confusion-Network-Based Chinese Spoken Term Detection
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.103
Jie Gao, Qingwei Zhao, Yonghong Yan, J. Shao
Abstract: This paper examines system combination for syllable-confusion-network (SCN)-based Chinese spoken term detection (STD). System combination for STD usually improves accuracy but suffers from increased index size or a complicated index structure. This paper explores methods for efficiently combining a word-based system and a syllable-based system while keeping the indices compact. First, a composite SCN is generated using one of two approaches: lattice combination (the SCN is generated from a combined lattice) or confusion network combination (two SCNs are combined into one). Then a simple, compact index is constructed from this composite SCN by merging cross-system redundant information. Experimental results on a 60-hour corpus show a relative accuracy improvement of 14.7% over the baseline syllable-based system. The method also reduces the index size by 22.3% compared to the commonly adopted score combination method while achieving comparable accuracy.
Citations: 8
Microphone Array Post-Filter Based on Auditory Filtering
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.105
Peng Li, FengChai Liao, Ning Cheng, Bo Xu, Wenju Liu
Abstract: In this paper, a microphone array post-filter based on auditory filtering is proposed to enhance the quality of the output signal. By using a gammatone filterbank to band-pass each input of the array, the input signals are decomposed into a two-dimensional time-frequency (T-F) representation. Then, for each auditory filter channel, the post-filter's coefficients are estimated in each frame using the decomposed multi-channel input signals. After post-filtering and synthesis, enhanced speech of better quality is obtained. Systematic evaluations on the CMU microphone array database show that the proposed method improves not only the noise reduction measure but also the speech quality measures.
Citations: 0
Mandarin Tone Perception with Temporal Envelope and Periodicity Cues from Different Frequency Regions
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.96
Meng Yuan, Tan Lee, S. Soli
Abstract: Temporal envelope and periodicity cues (TEPCs) are crucial for speech perception in hearing-impaired people who have poor frequency selectivity. This paper investigates the contributions of TEPCs extracted from different frequency regions to lexical tone perception in Mandarin. Tone identification tests were carried out with tone-contrasting monosyllabic and disyllabic words. Normal-hearing subjects were recruited for psychoacoustic experiments with acoustic stimuli that simulate the output of a cochlear implant. The results show that tone identification accuracy with sub-band TEPCs is consistently higher for male voices than for female voices. TEPCs from sub-bands above 1 kHz are found to contribute more to tone identification than those from sub-bands below 1 kHz, especially for male voices. Tone recognition performance can be improved by simply removing the low-frequency TEPCs. The same findings were obtained in our previous study on Cantonese tone perception. This suggests that emphasizing high-frequency TEPCs may be an effective strategy for improving speech perception in tonal languages.
Citations: 4
Mandarin Learning Using Speech and Language Technologies: A Translation Game in the Travel Domain
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.19
Yushi Xu, S. Seneff
Abstract: This paper describes a new Web-based translation game we have designed to help a student learn spoken Chinese. The student talks to the system in Chinese, and the system compares the recognized sentence against a set of English prompts to judge whether it is a suitable translation of any one of them. The game can also provide translation assistance upon request. The game was developed using the IWSLT corpus of utterances in the tourist domain and is oriented towards helping the student communicate effectively during foreign travel. In a preliminary evaluation, the system performed correctly on over 90% of test utterances and received positive feedback from the subjects.
Citations: 8
Heteronym Verification for Mandarin Speech Synthesis
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.46
Heng Lu, Zhenhua Ling, Si Wei, Yu Hu, Lirong Dai, Ren-Hua Wang
Abstract: Accurate phonetic transcription of a speech corpus is critical to high-quality speech synthesis. In a Mandarin text-to-speech (MTTS) system, one major problem in automatically labeling the database is heteronym annotation, because some single-character and multi-character words in Mandarin have more than one pronunciation. In this paper, a heteronym annotation verification method for MTTS database labeling is proposed. By training context-dependent HMMs and calculating the log-likelihood ratio, each heteronym in the database is assigned a confidence score, and those below the threshold are selected for manual inspection. We divide Mandarin heteronyms into two categories and use different features for each category. Experiments on an artificial test set show equal error rates (EER) of 7.9% and 11.9% for the two categories. A further test on an actual database containing a total of 36098 heteronyms shows that the proposed method can find 89 of all 123 annotation errors by inspecting only 639 polyphones.
Citations: 2
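The real-database figures quoted in the abstract above can be turned into rates with a few lines of arithmetic. The sketch below uses only the four counts stated there; the variable names are illustrative, and the precision figure assumes the 89 detected errors all lie among the 639 inspected items, as the abstract implies.

```python
# Counts taken from the abstract above.
total_heteronyms = 36098
inspected = 639
errors_total = 123
errors_found = 89

# Share of the database that had to be checked by hand.
inspection_fraction = inspected / total_heteronyms
# Share of all annotation errors that the method surfaced.
recall = errors_found / errors_total
# Share of inspected items that were real errors (assumed reading).
precision = errors_found / inspected

print(f"inspected: {inspection_fraction:.2%}")
print(f"recall:    {recall:.2%}")
print(f"precision: {precision:.2%}")
```

In other words, roughly 72% of the annotation errors are found while manually checking under 2% of the heteronyms.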
Improved Semi-Parametric Mean Trajectory Model Using Discriminatively Trained Centroids
2008 6th International Symposium on Chinese Spoken Language Processing Pub Date : 2008-12-30 DOI: 10.1109/CHINSL.2008.ECP.63
Ran Xu, Jielin Pan, Yonghong Yan
Abstract: To alleviate the limitation of the "state output probability conditional independence" assumption made by hidden Markov models (HMMs) in speech recognition, a discriminative semi-parametric trajectory model has been proposed in recent years, in which both the means and the variances of the acoustic models are modeled as time-varying variables. The time-varying information is modeled as a weighted contribution from all the "centroids", which can be viewed as a representation of the acoustic space. In previous work, such centroids are often obtained by clustering the Gaussians in the baseline acoustic models down to some reasonable number or by training a baseline model with fewer Gaussian components. Centroids obtained in this way are maximum likelihood estimates of the acoustic space and are relatively weak in discriminability compared to discriminatively trained acoustic models. In this paper, we propose an improved training framework for the semi-parametric mean trajectory model, in which the centroids are first discriminatively trained with the minimum phone error criterion to provide a more discriminative representation of the acoustic space. The method was evaluated on a Mandarin digit string recognition task. The experimental results show that the proposed method achieves a relative string error rate reduction of 7.5% compared to the traditional discriminative semi-parametric trajectory model, and a relative string error rate reduction of 28.6% compared to the baseline acoustic model trained with the maximum likelihood criterion.
Citations: 0