Multi-stage children story speech synthesis for Hindi
M. Harikrishna D., M. Gurunath Reddy, K. S. Rao
2015 Eighth International Conference on Contemporary Computing (IC3), published 2015-08-20
https://doi.org/10.1109/IC3.2015.7346682
In this paper, we propose a multi-stage children's story speech synthesis system for the Hindi language. The proposed system performs the following tasks: (i) classifying stories into genres based on their text, (ii) predicting emotion from the story text, (iii) deriving prosody rules (modification factors) specific to emotions and story genres, and (iv) synthesizing story speech using a mark-up language and the prosody modification factors. Keyword and part-of-speech (POS) features are used for story-genre classification and emotion prediction. The prosody modification factors are derived by analyzing the perceptual differences between synthesized neutral speech utterances and the corresponding utterances narrated by a storyteller. The story is synthesized by the Festival-based concatenative speech synthesizer, with the story annotated in the SABLE mark-up language. The quality and naturalness of the synthesized story speech are evaluated using subjective tests.
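The flow from emotion prediction to SABLE annotation can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword lexicon (English stand-ins for Hindi keywords), the percentage factors, and the function names `predict_emotion` and `to_sable` are all hypothetical, and a simple keyword count stands in for the keyword and POS based prediction described in the abstract. Genre-specific rules are omitted. The `PITCH` and `RATE` elements with `BASE` and `SPEED` attributes are the ones accepted by Festival's SABLE mode.

```python
from xml.sax.saxutils import escape

# Hypothetical keyword lexicon; the paper's actual features are not given here.
EMOTION_KEYWORDS = {
    "happy": {"laughed", "joy", "celebrated"},
    "sad": {"cried", "alone", "lost"},
    "fear": {"dark", "roared", "trembled"},
}

# Hypothetical prosody modification factors, in percent relative to neutral speech.
PROSODY_FACTORS = {
    "happy": {"pitch": 15, "rate": 10},
    "sad": {"pitch": -10, "rate": -15},
    "fear": {"pitch": 10, "rate": 20},
    "neutral": {"pitch": 0, "rate": 0},
}


def predict_emotion(sentence):
    """Return the emotion whose keywords occur most often, else 'neutral'."""
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    scores = {emo: len(words & kws) for emo, kws in EMOTION_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


def to_sable(sentences):
    """Wrap each sentence in SABLE PITCH and RATE tags for its predicted emotion."""
    parts = ["<SABLE>"]
    for s in sentences:
        f = PROSODY_FACTORS[predict_emotion(s)]
        parts.append(
            f'<PITCH BASE="{f["pitch"]:+d}%"><RATE SPEED="{f["rate"]:+d}%">'
            f"{escape(s)}</RATE></PITCH>"
        )
    parts.append("</SABLE>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(to_sable(["The lion roared in the dark.", "She cried alone."]))
```

Keeping the factors in a lookup table separate from the prediction step mirrors the staged design in the abstract: the table could be replaced by factors derived from storyteller recordings without touching the mark-up generation.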