International Symposium on Chinese Spoken Language Processing: Latest Publications

The modeling of tongue tip in Standard Chinese using MRI
International Symposium on Chinese Spoken Language Processing Pub Date : 2014-10-27 DOI: 10.1109/ISCSLP.2014.6936625
Gaowu Wang, J. Dang, Jiangping Kong
Abstract: In this paper, the tongue tip was modeled based on articulatory data from MRI images of Standard Chinese. First, an MRI articulatory database of Standard Chinese, including 9 vowels and 75 consonant variants, was established. Second, Principal Component Analysis (PCA) was performed on the tongue shape to find articulatory factors, and the result showed that the model is more precise and concise when the tongue is divided into the tongue tip and the tongue body, which are modeled separately. Finally, according to this result, the tongue tip was modeled by two articulatory parameters, Tongue Tip Protrude and Tongue Tip Raise, which represent the protruding/advancing and raising/retroflexing movements of the tongue tip.
Citations: 1
Speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS
International Symposium on Chinese Spoken Language Processing Pub Date : 2014-10-27 DOI: 10.1109/ISCSLP.2014.6936616
Po-Chun Wang, I-Bin Liao, Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen
Abstract: In this paper, a speaker adaptation method is proposed that adapts an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) of an SR-controlled Mandarin TTS system to a new speaker's data in order to realize a new voice. Two main problems are addressed: data sparseness, because the few adaptation utterances exist only in a small range of normal speaking rates, and the absence of adaptation data in both the fast and slow speaking-rate ranges. The proposed method follows the idea of SR-HPM training: it first normalizes the prosodic-acoustic features of the new speaker's speech data, then trains an HPM with the prosody labeling and modeling algorithm, and finally refines the HPM into an SR-dependent model. MAP adaptation with model parameter extrapolation is applied to cope with these two problems. Experimental results on a male speaker's adaptation data confirmed that the resulting adaptive SR-HPM has reasonable parameters covering a wide range of speaking rates and hence can be used in the TTS system to generate prosodic-acoustic features for synthesizing the new speaker's voice at any given SR.
Citations: 8
How to describe speech emotion more completely - An investigation on Chinese broadcast news speech
International Symposium on Chinese Spoken Language Processing Pub Date : 2012-12-01 DOI: 10.1109/ISCSLP.2012.6423508
Yingying Gao, Weibin Zhu
Abstract: A multi-perspective description method for emotional speech is proposed. It involves three perspectives (cognitive, psychological, and physiological) and seven phonetic evaluation features, with which it is expected to describe speech emotion more completely and to bridge the gap between emotions and acoustic signals. A series of experiments, including cognition tests and corpus annotation, was carried out. Although preliminary, the results indicate that the method is effective.
Citations: 3
Acoustic modeling for native and non-native Mandarin speech recognition
International Symposium on Chinese Spoken Language Processing Pub Date : 2012-12-01 DOI: 10.1109/ISCSLP.2012.6423544
Xin Chen, Jian Cheng
Abstract: In this paper, we first describe the automatic Spoken Chinese Test (SCT). With a large amount of native and non-native data collected for SCT, different training strategies for acoustic modeling were investigated. Evaluations were performed on native as well as non-native datasets. We found that directly combining native and non-native data to train acoustic models did not work well, and that the acoustic model trained only on native data achieved better performance when applied to non-native speech. We investigated how to use non-native data effectively and found that the Phonetic Decision Tree (PDT) had a great impact. Discriminative training was found to improve speech recognition accuracy effectively for both native and non-native Mandarin speech.
Citations: 2
Development of an articulatory visual-speech synthesizer to support language learning
International Symposium on Chinese Spoken Language Processing Pub Date : 2010-11-01 DOI: 10.1109/ISCSLP.2010.5684832
Ka Ho WONG, Wai-Kim Leung, W. Lo, H. Meng
Abstract: This paper presents a two-dimensional (2D) visual-speech synthesizer to support language learning. A visual-speech synthesizer animates the human articulators in synchronization with speech signals, e.g., output from a text-to-speech synthesizer. A visual-speech animation can offer language learners a concrete illustration of how to move and where to place the articulators when pronouncing a phoneme. We adopted 2D vector-based viseme models and compiled a collection of visemes to cover the articulation of all English phonemes (42 visemes for the 44 English phonemes). Morphing between properly selected vector-based articulation images achieves articulatory animations. In this way, we have developed an articulatory visual-speech synthesizer that can accept free-text input and synthesize articulatory dynamics in real time. An evaluation involving 32 subjects, based on "lip-reading", shows that they can identify the appropriate word(s) from the articulation animation alone nearly 80% of the time.
Citations: 8
Towards Automatic Tone Correction in Non-native Mandarin
International Symposium on Chinese Spoken Language Processing Pub Date : 2006-12-13 DOI: 10.1007/11939993_62
Mitchell Peabody, S. Seneff
Citations: 25
Nonlinear Emotional Prosody Generation and Annotation
International Symposium on Chinese Spoken Language Processing Pub Date : 2006-12-13 DOI: 10.1007/11939993_23
J. Tao, Jian Yu, Yongguo Kang
Citations: 0
Pitch Mean Based Frequency Warping
International Symposium on Chinese Spoken Language Processing Pub Date : 2006-12-13 DOI: 10.1007/11939993_13
Jian Liu, T. Zheng, Wenhu Wu
Citations: 16
Meeting Segmentation Using Two-Layer Cascaded Subband Filters
International Symposium on Chinese Spoken Language Processing Pub Date : 2006-12-13 DOI: 10.1007/11939993_68
M. Giuliani, T. Nwe, Haizhou Li
Citations: 2
Language Identification by Using Syllable-Based Duration Classification on Code-Switching Speech
International Symposium on Chinese Spoken Language Processing Pub Date : 2006-12-13 DOI: 10.1007/11939993_50
Dau-Cheng Lyu, Ren-Yuan Lyu, Yuang-Chin Chiang, Chun-Nan Hsu
Citations: 5