Development of Assamese Text-to-speech System using Deep Neural Network

2019 National Conference on Communications (NCC) Pub Date : 2019-02-01 DOI:10.1109/NCC.2019.8732262

A. Deka, Priyankoo Sarmah, K. Samudravijaya, S. Prasanna


Citations: 8

Abstract

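This paper describes the development of a text-to-speech system for the Assamese language using a deep neural network (DNN). The system is trained on speech data that was collected by a consortium and is available free of cost for academic use. The DNN-based method eliminates the need for grapheme-to-phoneme conversion; instead, it synthesizes speech directly from Assamese script encoded in UTF-8. Objective and subjective evaluations confirm that Assamese speech synthesized with the DNN approach is better than speech synthesized with a traditional hidden Markov model based text-to-speech system.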
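The abstract does not say how the UTF-8 script is presented to the network. As an illustration only, the sketch below shows one common way to turn script text into model input without a grapheme-to-phoneme step: decode the UTF-8 bytes, then map each Unicode code point to an integer ID. The function name `graphemes_to_ids`, the growing vocabulary, and the NFC normalization step are assumptions made for this example, not details taken from the paper.

```python
import unicodedata


def graphemes_to_ids(text, vocab=None):
    """Map each Unicode code point in `text` to an integer ID.

    Hypothetical helper: the paper does not describe its input encoding.
    `vocab` is a dict that grows as new characters are seen.
    """
    if vocab is None:
        vocab = {}
    # Normalize so that equivalent character sequences share one form.
    text = unicodedata.normalize("NFC", text)
    ids = []
    for ch in text:
        if ch not in vocab:
            vocab[ch] = len(vocab)  # assign the next free ID
        ids.append(vocab[ch])
    return ids, vocab


# Three letters from the Bengali-Assamese Unicode block (U+0985, U+09B8,
# U+09AE), written as escapes so the example does not depend on file encoding.
sample = "\u0985\u09b8\u09ae"
raw_bytes = sample.encode("utf-8")      # what a UTF-8 text file would hold
decoded = raw_bytes.decode("utf-8")     # back to code points
ids, vocab = graphemes_to_ids(decoded)
print(len(raw_bytes), ids)
```

Each of these letters occupies three bytes in UTF-8, so working at the code-point level rather than the byte level keeps one ID per written character. A real system would then feed such IDs (or one-hot vectors derived from them) to the acoustic model.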



