C. Hansakunbuntheung, A. Thangthai, N. Thatphithakkul, Altangerel Chagnaa
{"title":"Mongolian speech corpus for text-to-speech development","authors":"C. Hansakunbuntheung, A. Thangthai, N. Thatphithakkul, Altangerel Chagnaa","doi":"10.1109/ICSDA.2011.6085994","DOIUrl":null,"url":null,"abstract":"This paper presents a first attempt to develop Mongolian speech corpus that designed for data-driven speech synthesis in Mongolia. The aim of the speech corpus is to develop a high-quality Mongolian TTS for blinds to use with screen reader. The speech corpus contains nearly 6 hours of Mongolian phones. It well provides Cyrillic text transcription and its phonetic transcription with stress marking. It also provides context information including phone context, stressing levels, syntactic position in word, phrase and utterance for modeling speech acoustics and characteristics for speech synthesis.","PeriodicalId":269402,"journal":{"name":"2011 International Conference on Speech Database and Assessments (Oriental COCOSDA)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Speech Database and Assessments (Oriental COCOSDA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSDA.2011.6085994","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
This paper presents a first attempt to develop Mongolian speech corpus that designed for data-driven speech synthesis in Mongolia. The aim of the speech corpus is to develop a high-quality Mongolian TTS for blinds to use with screen reader. The speech corpus contains nearly 6 hours of Mongolian phones. It well provides Cyrillic text transcription and its phonetic transcription with stress marking. It also provides context information including phone context, stressing levels, syntactic position in word, phrase and utterance for modeling speech acoustics and characteristics for speech synthesis.