2009 Oriental COCOSDA International Conference on Speech Database and Assessments: Latest Publications

Phonetic aspects of content design in AESOP (Asian English Speech cOrpus Project)
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278376
T. Visceglia, Chiu-yu Tseng, M. Kondo, H. Meng, Y. Sagisaka
{"title":"Phonetic aspects of content design in AESOP (Asian English Speech cOrpus Project)","authors":"T. Visceglia, Chiu-yu Tseng, M. Kondo, H. Meng, Y. Sagisaka","doi":"10.1109/ICSDA.2009.5278376","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278376","url":null,"abstract":"This research is part of the ongoing multinational collaboration “Asian English Speech cOrpus Project” (AESOP), whose aim is to build up an Asian English speech corpus representing the varieties of English spoken in Asia. AESOP is an international consortium of linguists, speech scientists, psychologists and educators from Japan, Taiwan, Hong Kong, China, Thailand, Indonesia and Mongolia. Its primary aim is to collect and compare Asian English speech corpora from the countries listed above in order to derive a set of core properties common to all varieties of Asian English, as well as to discover features that are particular to individual varieties. Each research team will use a common recording setup and share an experimental task set, and will develop a common, open-ended annotation system. Moreover, AESOP-collected corpora will be an open resource, available to the research community at large. The initial stage of the phonetics aspect of this project will be devoted to designing spoken-language tasks which will elicit production of a large range of English segmental and suprasegmental characteristics. These data will be used to generate a catalogue of acoustic characteristics particular to individual varieties of Asian English, which will then be compared with the data collected by other AESOP members in order to determine areas of overlap between L1 and L2 English as well as differences among varieties of Asian English.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131419125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 56
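As an illustration of the kind of acoustic cataloguing the abstract mentions, below is a minimal, self-contained Python sketch (not part of AESOP's actual toolchain) that extracts two basic suprasegmental measures, utterance duration and a rough F0 contour, from a 16-bit mono WAV file. The file name and frame settings are placeholders, and the frame-wise autocorrelation estimate is a simplification of what a real pitch tracker would do.

```python
import wave
import numpy as np

def read_wav(path):
    """Return (samples as floats in [-1, 1], sample rate) from a 16-bit mono WAV."""
    with wave.open(path, "rb") as w:
        sr = w.getframerate()
        data = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    return data.astype(np.float64) / 32768.0, sr

def f0_contour(x, sr, frame_ms=40, hop_ms=10, fmin=75, fmax=400):
    """Very rough frame-wise F0 estimate via autocorrelation peak picking."""
    frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    lag_min, lag_max = int(sr / fmax), int(sr / fmin)
    f0 = []
    for start in range(0, len(x) - frame, hop):
        seg = x[start:start + frame]
        seg = seg - seg.mean()
        ac = np.correlate(seg, seg, mode="full")[frame - 1:]  # lags 0..frame-1
        if ac[0] <= 0:
            f0.append(0.0)                                    # silence
            continue
        lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
        # keep the estimate only if the autocorrelation peak is reasonably strong
        f0.append(sr / lag if ac[lag] / ac[0] > 0.3 else 0.0)
    return np.array(f0)

# Hypothetical usage with a recording from one of the elicitation tasks:
# x, sr = read_wav("speaker01_task3.wav")
# contour = f0_contour(x, sr)
# voiced = contour[contour > 0]
# print("duration (s):", len(x) / sr, "mean F0 (Hz):", voiced.mean())
```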
Development and application of multilingual speech translation
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278383
Satoshi Nakamura
{"title":"Development and application of multilingual speech translation","authors":"Satoshi Nakamura","doi":"10.1109/ICSDA.2009.5278383","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278383","url":null,"abstract":"This paper describes the latest version of handheld speech-to-speech translation system developed by National Institute of Information and Communications Technology, NICT. As the entire speech-to-speech translation functions are implemented into one terminal, it realizes real-time and location free speech-to-speech translation service for many language pairs. A new noise-suppression technique notably improves speech recognition performance. Corpus-based approaches of recognition, translation, and synthesis enabled wide range coverage of topic varieties and portability to other languages. Currently, we mainly focus on translation between Japanese, English and Chinese.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121791061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
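The cascaded architecture described in the abstract (recognition, then translation, then synthesis on one terminal) can be pictured with the following hypothetical Python skeleton. The class and method names are placeholders for illustration only and do not correspond to NICT's actual APIs.

```python
from dataclasses import dataclass
from typing import Protocol

class Recognizer(Protocol):
    def transcribe(self, audio: bytes, lang: str) -> str: ...

class Translator(Protocol):
    def translate(self, text: str, src: str, tgt: str) -> str: ...

class Synthesizer(Protocol):
    def synthesize(self, text: str, lang: str) -> bytes: ...

@dataclass
class SpeechTranslator:
    """Cascade of ASR, MT and TTS components running on one device."""
    asr: Recognizer
    mt: Translator
    tts: Synthesizer

    def translate_speech(self, audio: bytes, src: str, tgt: str) -> bytes:
        text = self.asr.transcribe(audio, src)           # source speech -> source text
        translated = self.mt.translate(text, src, tgt)   # source text -> target text
        return self.tts.synthesize(translated, tgt)      # target text -> target speech
```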
Advances in Chinese Natural Language Processing and Language resources
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278384
J. Tao, Fang Zheng, Ai-jun Li, Ya Li
{"title":"Advances in Chinese Natural Language Processing and Language resources","authors":"J. Tao, Fang Zheng, Ai-jun Li, Ya Li","doi":"10.1109/ICSDA.2009.5278384","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278384","url":null,"abstract":"In the past few years, there have been a significant number of activities in the area of Chinese Natural Language Processing (CNLP) including the language resource construction and assessment. This paper summarized the major tasks and key technologies in Natural Language Processing (NLP), which encompasses both text processing and speech processing by extension. The Chinese Language resources, including linguistic data, speech data, evaluation data and language toolkits which are elaborately constructed for CNLP related fields and some language resource consortiums are also introduced in this paper. Aimed to promote the development of corpus-based technologies, many resource consortiums commit themselves to collect, create and distribute many kinds of resources. The goal of these organizations is to set up a universal and well accepted Chinese resources database so that to push forward the CNLP.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125962213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Speech timing and cross-linguistic studies towards computational human modeling
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278386
Y. Sagisaka, H. Kato, M. Tsuzaki, Shizuka Nakamura, C. Hansakunbuntheung
{"title":"Speech timing and cross-linguistic studies towards computational human modeling","authors":"Y. Sagisaka, H. Kato, M. Tsuzaki, Shizuka Nakamura, C. Hansakunbuntheung","doi":"10.1109/ICSDA.2009.5278386","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278386","url":null,"abstract":"In this paper, we introduce Japanese segmental duration characteristics and computational modeling that we have been studying for around three decades in speech synthesis. A series of experimental results are also shown on loudness dependence in the duration perception. These computational duration modeling and perceptual studies on duration error sensitivity to loudness give some insights for computational human modeling of spoken language capability. As a first trial to figure out how these findings could be efficiently employed in other field like language learning, we introduce our current efforts on the objective evaluation of 2nd language speaking skill and the research consortium of AESOP (Asian English Speech cOrpus Project) where researchers in Asian countries have started to work together.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124376438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
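To make the idea of factor-based duration modeling concrete, here is a toy Python sketch, not the authors' model, that regresses log segment durations on one-hot contextual factors (phone identity and phrase position) with ordinary least squares; the tiny data set is invented.

```python
import numpy as np

# Hypothetical training data: (phone, position in phrase, observed duration in ms)
data = [("a", "initial", 95), ("a", "final", 140), ("k", "initial", 60),
        ("k", "final", 85), ("a", "medial", 100), ("k", "medial", 70)]

phones = sorted({p for p, _, _ in data})
positions = sorted({q for _, q, _ in data})

def features(phone, pos):
    """One-hot encode phone identity and phrase position, plus a bias term."""
    v = np.zeros(len(phones) + len(positions) + 1)
    v[phones.index(phone)] = 1.0
    v[len(phones) + positions.index(pos)] = 1.0
    v[-1] = 1.0
    return v

X = np.array([features(p, q) for p, q, _ in data])
y = np.log([d for _, _, d in data])
w, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares fit in the log domain

predict = lambda p, q: float(np.exp(features(p, q) @ w))
print(f"predicted /a/ in phrase-final position: {predict('a', 'final'):.1f} ms")
```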
Intonation patterns of yes-no questions for Chinese EFL learners
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278369
Xiaoli Ji, Xia Wang, Ai-jun Li
{"title":"Intonation patterns of yes-no questions for Chinese EFL learners","authors":"Xiaoli Ji, Xia Wang, Ai-jun Li","doi":"10.1109/ICSDA.2009.5278369","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278369","url":null,"abstract":"The present study investigates Chinese EFL (English as a foreign language) learners' intonation pattern of yes-no questions on the basis of AM theory. According to our study, American speakers adopt a low-level (L*) or low rising tone (L*H) on nuclear accents no matter the nuclear accent is on the medial or final part of a sentence. By contrast, Chinese EFL learners apply a high-level (H*) or falling (H*L) tone when a nuclear accent falls on the medial part of a sentence but a falling (H*L) or low rising tone (L*H) when it is on the final part. The final boundary tone of Chinese EFL learners can be either high (H%) or low (L%) while American speakers mainly apply the H% boundary tone. Besides, Chinese EFL learners' pitch movements of nuclear accents in yes-no questions are similar to those of statements.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"53 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131751773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
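The tone labels used in the abstract (H*, L*, H*L, L*H, plus boundary tones) can be illustrated with a deliberately simple classifier. The sketch below, which is not the authors' annotation procedure, labels a nuclear accent from the F0 at the start and end of the accented syllable, measured in semitones relative to the speaker's mean F0; the 1.5-semitone threshold is an arbitrary example value.

```python
import math

def semitones(f0_hz, ref_hz):
    """F0 expressed in semitones relative to a reference frequency."""
    return 12.0 * math.log2(f0_hz / ref_hz)

def label_nuclear_accent(f0_start, f0_end, speaker_mean_f0, rise_threshold=1.5):
    start = semitones(f0_start, speaker_mean_f0)
    end = semitones(f0_end, speaker_mean_f0)
    movement = end - start
    if movement > rise_threshold:
        return "L*H" if start < 0 else "H*"   # rise from low vs already high
    if movement < -rise_threshold:
        return "H*L"                          # falling tone
    return "H*" if start >= 0 else "L*"       # level high vs level low

# e.g. a learner's sentence-medial accent: high onset, falling offset
print(label_nuclear_accent(230, 180, speaker_mean_f0=200))  # -> "H*L"
```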
Construction of Chinese conversational corpora for spontaneous speech recognition and comparative study on the trilingual parallel corpora
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278375
Xinhui Hu, R. Isotani, Satoshi Nakamura
{"title":"Construction of Chinese conversational corpora for spontaneous speech recognition and comparative study on the trilingual parallel corpora","authors":"Xinhui Hu, R. Isotani, Satoshi Nakamura","doi":"10.1109/ICSDA.2009.5278375","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278375","url":null,"abstract":"In this paper, we describe the development of Chinese conversational segmented and POS-tagged corpora currently used in the NICT/ATR speech-to-speech translation system. Over 500K manually checked utterances provide 3.5M words of Chinese corpora. As far as we know, they are the largest conversational textual corpora; in the domain of travel. A set of three parallel corpora is obtained with the corresponding pairs of Japanese and English words from which the Chinese words are translated. Based on these parallel corpora, we make an investigation on the statistics of each language, performances of language model and speech recognition, and find the differences among these languages. The problems and their solutions to the present Chinese corpora are also analyzed and discussed.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"1999 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132337610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
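The language-model comparison mentioned in the abstract boils down to training an n-gram model per language and measuring perplexity on held-out text. The following self-contained Python sketch shows the computation for an add-one-smoothed bigram model on a toy travel-domain example; real experiments would use full n-gram toolkits and the corpora described above.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams

def perplexity(sentences, unigrams, bigrams):
    """Add-one-smoothed bigram perplexity on held-out sentences."""
    vocab = len(unigrams)
    log_prob, count = 0.0, 0
    for sent in sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        for prev, cur in zip(toks, toks[1:]):
            p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
            log_prob += math.log(p)
            count += 1
    return math.exp(-log_prob / count)

train = ["could I have a window seat", "I would like a window seat please"]
test = ["could I have a seat please"]
uni, bi = train_bigram(train)
print(f"perplexity: {perplexity(test, uni, bi):.1f}")
```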
Grapheme to Phoneme (G2P) conversion for Bangla
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278373
Joyanta Basu, T. Basu, Mridusmita Mitra, Shyamal Kr. Das Mandal
{"title":"Grapheme to Phoneme (G2P) conversion for Bangla","authors":"Joyanta Basu, T. Basu, Mridusmita Mitra, Shyamal Kr. Das Mandal","doi":"10.1109/ICSDA.2009.5278373","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278373","url":null,"abstract":"The automatic conversion of text to phoneme is a necessary step in all-current approaches to Text-to-Speech (TTS) synthesis and Automatic Speech Recognition System. This paper presents a methodology for Grapheme to Phoneme (G2P) conversion for Bangla based on orthographic rules. In Bangla G2P conversion sometimes depends not only on orthographic information but also on Parts of Speech (POS) information and semantics. This paper also addresses these issues along with their implementation methodology. The G2P conversion system of Bangla is tested on 1000 different types of Bangla sentences containing 9294 words. The percentage of correct conversion is 91.58% without considering the semantics and contextual POS with the exception table size of 333 words. If those errors which occur due to lack of exceptional words are considered, then the percentage of correct conversion will increase to 98%.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116564128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 26
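A rule table consulted after an exception lexicon, as described in the abstract, can be sketched as follows. The example is a toy in romanised form with invented rules and words, not real Bangla orthography, and omits the POS and semantic cues the paper discusses.

```python
EXCEPTIONS = {            # lexicalised pronunciations, checked before the rules
    "chhilo": "tS i l o",
}

RULES = [                 # ordered orthographic rewrite rules, longest grapheme first
    ("chh", "tSh"),
    ("sh", "S"),
    ("o", "o"),
    ("i", "i"),
    ("l", "l"),
    ("k", "k"),
    ("a", "a"),
]

def g2p(word: str) -> str:
    """Exception table first, then greedy left-to-right rule application."""
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    phones, i = [], 0
    while i < len(word):
        for graph, phone in RULES:
            if word.startswith(graph, i):
                phones.append(phone)
                i += len(graph)
                break
        else:
            phones.append(word[i])    # no rule matched: fall back to the letter itself
            i += 1
    return " ".join(phones)

print(g2p("shikha"))   # rule-based path
print(g2p("chhilo"))   # exception-table path
```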
Speech processing technology of Uyghur language
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278381
Wushouer Silamu, Nasirjan Tursun, Parida Saltiniyaz
{"title":"Speech processing technology of Uyghur language","authors":"Wushouer Silamu, Nasirjan Tursun, Parida Saltiniyaz","doi":"10.1109/ICSDA.2009.5278381","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278381","url":null,"abstract":"In recent years, there have been a significant number of activities in the area of Uyghur speech processing. This paper summarized the major tasks and key technologies in Uyghur speech processing, which including the speech database, continuous speech recognition and speech synthesis. Uyghur language is one of the least studied languages on speech processing area. For this reason, in our work, the first step was collecting large amount continuous speech data of Uyghur language. In Uyghur continuous speech recognition, we were building the HMM state models for each recognition unit, and were using the recognizer of HTK3.3 (HMM ToolKit) and the MS Visual C++8.0 developing the basic Uyghur Continuous Speech Recognition System. In Uyghur speech synthesis, we were designing and developing an intelligible and natural sounding corpus-based speech synthesis system.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125172567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
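The HMM decoding that underlies recognizers such as HTK's HVite can be illustrated with a compact Viterbi routine. The sketch below uses an invented three-state left-to-right model with toy probabilities; it is unrelated to the actual Uyghur acoustic models.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """log_init: (S,), log_trans: (S, S), log_emit: (T, S); returns the best state path."""
    T, S = log_emit.shape
    delta = np.full((T, S), -np.inf)          # best log score ending in each state
    back = np.zeros((T, S), dtype=int)        # backpointers
    delta[0] = log_init + log_emit[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # rows: previous state, cols: current
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Toy 3-state left-to-right model observed over 5 frames
log_init = np.log([1.0, 1e-9, 1e-9])
log_trans = np.log([[0.6, 0.4, 1e-9],
                    [1e-9, 0.6, 0.4],
                    [1e-9, 1e-9, 1.0]])
log_emit = np.log(np.array([[0.8, 0.1, 0.1],
                            [0.6, 0.3, 0.1],
                            [0.2, 0.6, 0.2],
                            [0.1, 0.5, 0.4],
                            [0.1, 0.2, 0.7]]))
print(viterbi(log_init, log_trans, log_emit))
```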
Design and development of phonetically rich Urdu speech corpus
Pub Date: 2009-10-02 | DOI: 10.1109/ICSDA.2009.5278380
Agha Ali Raza, S. Hussain, Huda Sarfraz, Inam Ullah, Z. Sarfraz
{"title":"Design and development of phonetically rich Urdu speech corpus","authors":"Agha Ali Raza, S. Hussain, Huda Sarfraz, Inam Ullah, Z. Sarfraz","doi":"10.1109/ICSDA.2009.5278380","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278380","url":null,"abstract":"Phonetically rich speech corpora play a pivotal role in speech research. The significance of such resources becomes crucial in the development of Automatic Speech Recognition systems and Text to Speech systems. This paper presents details of designing and developing an optimal context based phonetically rich speech corpus for Urdu that will serve as a baseline model for training a Large Vocabulary Continuous Speech Recognition system for Urdu language.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"201202 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116486887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 33
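Phonetically rich corpora of this kind are typically built by greedy sentence selection: repeatedly pick the sentence that adds the most uncovered units (phonemes or triphones). The sketch below illustrates that general technique on a toy example and is not the authors' exact procedure.

```python
def greedy_select(sentences, phonemize, target_units):
    """sentences: list of str; phonemize: str -> set of units; returns (chosen, covered)."""
    covered, chosen = set(), []
    remaining = list(sentences)
    while covered < target_units and remaining:
        best = max(remaining, key=lambda s: len(phonemize(s) - covered))
        gain = phonemize(best) - covered
        if not gain:
            break                      # nothing left adds coverage
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered

# Toy example with letters standing in for phonemes
phonemize = lambda s: set(s.replace(" ", ""))
corpus = ["ab ba", "cd ab", "ef gh", "xy cd"]
target = set("abcdefghxy")
picked, cov = greedy_select(corpus, phonemize, target)
print(picked, sorted(cov))
```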
Multi-speaker adaptation for robust speech recognition under ubiquitous environment
Pub Date: 2009-08-01 | DOI: 10.1109/ICSDA.2009.5278364
Po-Yi Shih, Jhing-Fa Wang, Yuan-Ning Lin, Zhonghua Fu
{"title":"Multi-speaker adaptation for robust speech recognition under ubiquitous environment","authors":"Po-Yi Shih, Jhing-Fa Wang, Yuan-Ning Lin, Zhonghua Fu","doi":"10.1109/ICSDA.2009.5278364","DOIUrl":"https://doi.org/10.1109/ICSDA.2009.5278364","url":null,"abstract":"This paper presents a multi-speaker adaptation for robust speech recognition under ubiquitous environment. The goal is to adapt the speech recognition model for each speaker correctly in ubiquitous multi-speaker environment. We integrate speaker recognition and unsupervised speaker adaptation method to promote the speech recognition performances. Specifically we employ a confidence measure to reduce the possible negative adaptation caused by the environment noise or the recognition errors. The experimental results show that the proposed framework can efficiently promote the average recognition accuracy to 80∼90% for multi-speaker ubiquitous speech recognition.","PeriodicalId":254906,"journal":{"name":"2009 Oriental COCOSDA International Conference on Speech Database and Assessments","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127721234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
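Confidence-gated unsupervised adaptation, the core idea in the abstract, can be sketched as follows: a per-speaker model statistic is updated only when the recognizer's confidence for an utterance exceeds a threshold, which limits negative adaptation from noise or misrecognition. The MAP-style interpolation and the 0.7 threshold are generic illustrative choices, not the authors' formula.

```python
import numpy as np

class SpeakerModel:
    def __init__(self, global_mean, tau=10.0):
        self.mean = np.asarray(global_mean, dtype=float)  # start from the speaker-independent model
        self.count = 0.0
        self.tau = tau                                    # prior weight of the global model

    def adapt(self, features, confidence, threshold=0.7):
        """features: (N, D) frames of one utterance; adapt only if the ASR confidence is high."""
        if confidence < threshold:
            return False                                  # skip: likely noise or a recognition error
        feats = np.asarray(features, dtype=float)
        n = len(feats)
        utt_mean = feats.mean(axis=0)
        # MAP-style interpolation between the current mean and the new evidence
        denom = self.tau + self.count + n
        self.mean = (self.tau + self.count) / denom * self.mean + n / denom * utt_mean
        self.count += n
        return True

model = SpeakerModel(global_mean=[0.0, 0.0])
model.adapt(np.random.randn(50, 2) + 1.0, confidence=0.9)   # accepted
model.adapt(np.random.randn(50, 2) - 5.0, confidence=0.4)   # rejected by the gate
print(model.mean)
```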