Prosodic Processing for the Automatic Synthesis of Emotional Russian Speech
Authors: A. Kaliyev, Yuri N. Matveev, E. Lyakso, S. Rybin
Published in: 2018 IEEE International Conference "Quality Management, Transport and Information Security, Information Technologies" (IT&QM&IS)
Publication date: 2018-09-01
DOI: https://doi.org/10.1109/ITMQIS.2018.8525072
Citations: 5
Abstract
Automatic speech synthesis is undergoing significant changes driven by new machine learning methods. These methods qualitatively improve the sound of synthesized speech, bringing it closer to natural human speech. Against this backdrop, and under the influence of business demand, the development of artificial emotional speech for human-machine interaction systems has gained new momentum. As a result, prosodic processing for the synthesis of emotional Russian speech has become an important research direction for our group. The article presents an algorithm for predicting pause locations in three categories of emotional speech. In particular, the authors trained classifiers on three corpora of emotional speech, collected by emotional category (neutral, excited, and depressed). The obtained results can be used to build a high-quality automatic synthesizer of emotional speech.
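The abstract does not say which features or classifier the authors used, so the following is only a minimal sketch of the general idea: one pause classifier per emotional category, each trained on its own corpus. The trailing-punctuation feature, the majority-vote rule, the names `PauseModel` and `train_per_emotion`, and the toy sentences are all assumptions made for illustration. They are not taken from the paper.

```python
from collections import defaultdict

# Emotional categories named in the abstract.
EMOTIONS = ("neutral", "excited", "depressed")
PUNCTUATION = ",.;:!?"


def boundary_feature(token):
    """Hypothetical feature: the punctuation mark ending the token, or ''."""
    last = token[-1:]
    return last if last and last in PUNCTUATION else ""


class PauseModel:
    """Majority-vote classifier: does a pause follow a token with this feature?"""

    def __init__(self):
        # feature -> [count without pause, count with pause]
        self.counts = defaultdict(lambda: [0, 0])

    def fit(self, sentences):
        # sentences: iterable of (tokens, pauses); pauses[i] is True when
        # a pause follows tokens[i] in the recording.
        for tokens, pauses in sentences:
            for token, has_pause in zip(tokens, pauses):
                self.counts[boundary_feature(token)][int(has_pause)] += 1
        return self

    def predict(self, tokens):
        predictions = []
        for token in tokens:
            # Unseen features fall back to (0, 0), which predicts no pause.
            no_pause, pause = self.counts.get(boundary_feature(token), (0, 0))
            predictions.append(pause > no_pause)
        return predictions


def train_per_emotion(corpora):
    """Train one independent model per emotional category."""
    return {emotion: PauseModel().fit(corpora[emotion]) for emotion in EMOTIONS}


# Invented toy data, for illustration only (not from the authors' corpora).
TOY_CORPORA = {
    "neutral": [(["well,", "we", "can", "go."], [True, False, False, True])],
    "excited": [(["well,", "we", "can", "go!"], [False, False, False, True])],
    "depressed": [(["well,", "we", "can", "go."], [True, False, False, True])],
}

models = train_per_emotion(TOY_CORPORA)
neutral_pauses = models["neutral"].predict(["yes,", "go."])
excited_pauses = models["excited"].predict(["yes,", "go."])
```

In this toy setup the same text receives different pause predictions depending on which category's model is applied, which is the behaviour a per-category approach is meant to capture. A realistic system would need richer features and a stronger classifier than this sketch.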