Chiranjeevi Yarra, Ritu Aggarwal, Avni Rajpal, P. Ghosh
Indic TIMIT and Indic English lexicon: A speech database of Indian speakers using TIMIT stimuli and a lexicon from their mispronunciations
2019 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)
Published: 2019-10-01
DOI: https://doi.org/10.1109/O-COCOSDA46868.2019.9041230
Citations: 11
Abstract
With advances in speech technology, demand for larger speech corpora is increasing, particularly for corpora from non-native English speakers. To meet this demand in the Indian context, we collect Indic TIMIT, a phonetically rich Indian English speech corpus. It contains ~240 hours of speech recordings from 80 subjects, each of whom has spoken a set of 2342 stimuli available in the TIMIT corpus. The corpus also contains phoneme transcriptions for a subset of the recordings, manually annotated by two linguists to reflect each speaker's pronunciation. These properties make Indic TIMIT unique among the existing corpora available in the Indian context. Along with Indic TIMIT, we provide the Indic English lexicon, constructed by adding pronunciation variations specific to Indian speakers, obtained from their errors, to the existing word pronunciations in a native English lexicon. In this paper, we show the effectiveness of Indic TIMIT in comparison with data from TIMIT, and the effectiveness of the Indic English lexicon in comparison with a lexicon augmented with all the word pronunciations from CMU, Beep, and the lexicon available in the TIMIT corpus. Indic TIMIT and the Indic English lexicon could be useful for a number of applications in the Indian context, including automatic speech recognition, mispronunciation detection and diagnosis, native language identification, accent adaptation, accent conversion, voice conversion, speech synthesis, grapheme-to-phoneme conversion, automatic phoneme unit discovery, and pronunciation error analysis.
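The lexicon construction described in the abstract, adding observed pronunciation variants to the entries of a native English lexicon, can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `augment_lexicon`, the dictionary layout, and the toy phone sequences are all assumptions made for illustration, and none of the entries are taken from Indic TIMIT or the Indic English lexicon.

```python
from collections import defaultdict


def augment_lexicon(native_lexicon, observed_variants):
    """Return a new lexicon holding native pronunciations plus observed variants.

    Both arguments map a word to a list of pronunciations, where each
    pronunciation is a tuple of phone symbols. Duplicates are dropped and
    native pronunciations are listed first.
    """
    merged = defaultdict(list)
    for source in (native_lexicon, observed_variants):
        for word, pronunciations in source.items():
            for pron in pronunciations:
                if pron not in merged[word]:
                    merged[word].append(pron)
    return dict(merged)


# Hypothetical toy entries, not drawn from any real lexicon or annotation.
native = {
    "the": [("dh", "ah")],
    "water": [("w", "ao", "t", "er")],
}
variants = {
    "the": [("d", "ah")],
    "water": [("w", "ao", "t", "er"), ("v", "aa", "t", "ar")],
}

lexicon = augment_lexicon(native, variants)
for word, pronunciations in sorted(lexicon.items()):
    for pron in pronunciations:
        print(word, " ".join(pron))
```

In this sketch the variants would come from manual phoneme transcriptions like those the abstract mentions; how the authors actually select and filter variants is not specified here.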