Han-Ping Shen, Chung-Hsien Wu, Yan-Ting Yang, C. Hsu
{"title":"CECOS: A Chinese-English code-switching speech database","authors":"Han-Ping Shen, Chung-Hsien Wu, Yan-Ting Yang, C. Hsu","doi":"10.1109/ICSDA.2011.6085992","DOIUrl":"https://doi.org/10.1109/ICSDA.2011.6085992","url":null,"abstract":"With the increase on the demands for code-switching automatic speech recognition (ASR), the design and development of a code-switching speech database becomes highly desirable. However, it is not easy to collect sufficient code-switched utterances for model training for code-switching ASR. This study presents the procedure and experience for the design and development of a Chinese-English COde-switching Speech database (CECOS). Two different methods for collecting Chinese-English code-switched utterances are employed in this work. The applications of the collected database are also introduced. The CECOS database not only contains the speech data with code-switch properties but also accents due to non-native speakers. This database can be applied to several applications, such as code-switching speech recognition, language identification, named entity detection, etc.","PeriodicalId":269402,"journal":{"name":"2011 International Conference on Speech Database and Assessments (Oriental COCOSDA)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123408465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chien-Lin Huang, Yu Tsao, Chiori Hori, H. Kashioka
{"title":"Feature normalization and selection for robust speaker state recognition","authors":"Chien-Lin Huang, Yu Tsao, Chiori Hori, H. Kashioka","doi":"10.1109/ICSDA.2011.6085988","DOIUrl":"https://doi.org/10.1109/ICSDA.2011.6085988","url":null,"abstract":"In this paper, we propose an integration process of feature compensation and selection on the collective acoustic feature sets to derive a set of advanced acoustic features for speaker state recognition. For feature normalization, we perform a two-dimensional histogram equalization (2-D HEQ) normalization to reduce variability of speaker and speaking environment factors. For feature selection, we apply a principal component analysis (PCA)-based feature selection to extract meaningful parameters from the original acoustic feature sets and to eliminate redundant components. We conducted experiments on Alcohol Language Corpus (ALC) and Sleepy Language Corpus (SLC) provided in INTERSPEECH 2011 Speaker State Challenge. The openSMILE toolkit is used to extract acoustic features of low-level-descriptors and their related functionals. Experimental results show that the derived acoustic feature set, processed by 2-D HEQ normalization and PCA-based selection, gives improvements over the original feature sets. The results verify that the derived acoustic feature set is a discriminative and compact representation that efficiently exploits multiple knowledge sources from the ensemble acoustic feature sets.","PeriodicalId":269402,"journal":{"name":"2011 International Conference on Speech Database and Assessments (Oriental COCOSDA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122632524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intonation patterns of exclamations of Chinese EFL learners from Zhenjiang","authors":"Pengfei Shao, Yuan Jia, Ai-jun Li","doi":"10.1109/ICSDA.2011.6085983","DOIUrl":"https://doi.org/10.1109/ICSDA.2011.6085983","url":null,"abstract":"The present study investigates Chinese EFL (English as a Foreign Language) learners' intonation pattern of exclamatory sentences. The results show that both Chinese EFL learners and English RP speakers (Received Pronunciation) adopt a falling (H*L) tone of a sentence to realize the exclamatory intonation. However, before the last H*L in an intonational phrase, a rising tone and the pitch contour enlargement can be observed. This study also shows that the final boundary tone of Chinese EFL learners is mostly realized as low (L%), and the same pattern can be found in English RP speakers, whereas the initial boundary tone of Chinese EFL learners is mostly high(H%). Moreover, Chinese EFL learners' pitch range of exclamations is wider than that of English RP speakers, and Chinese EFL learners intend to use more pauses in a sentence.","PeriodicalId":269402,"journal":{"name":"2011 International Conference on Speech Database and Assessments (Oriental COCOSDA)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116472081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}