Sanket Barhate, S. Kshirsagar, Niramay Sanghvi, Kamini Sabu, P. Rao, N. Bondale
Prosodic features of Marathi news reading style
2016 IEEE Region 10 Conference (TENCON), published 2016-11-01
DOI: 10.1109/TENCON.2016.7848421 (https://doi.org/10.1109/TENCON.2016.7848421)
Text-to-speech synthesizers present an attractive alternative to reading in hands-free communication scenarios. Speech intelligibility and naturalness are key to the user acceptability of synthesized speech, and accurate modeling of prosody plays an important role in both. While prosody is language dependent, it also depends strongly on the speaking style. In this work, we study the important prosodic features of news reading style in Marathi using publicly available radio broadcasts. Prominence and boundaries are among the important linguistic cues conveyed via a news reader's prosody. Using perception testing, we obtain boundaries and prominent words in broadcast recordings of two female news readers. We measure acoustic parameters known to serve as cues to prominence, such as fundamental frequency (F0), duration and intensity. We also make observations on timing and pitch phenomena at inter- and intra-sentence breaks. Our results indicate that prominence depends strongly on the F0 span achieved in the word and, to a smaller extent, on duration increase. Breaks are signaled by pauses and pre-boundary lengthening of the final syllable. We observe that, unlike in English, sentence endings in Marathi are not always accompanied by a pitch fall in the final syllable. The implications of these observations for prosody generation are discussed.
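As a rough illustration of the two prominence cues highlighted above, the following Python sketch computes a per-word fundamental frequency span and a duration ratio. It is not the authors' procedure: measuring the span in semitones, the frame-level contour format (unvoiced frames given as `None` or `0`), the function names `f0_span_semitones` and `duration_ratio`, and the sample values are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence


def f0_span_semitones(f0_hz: Sequence[Optional[float]]) -> float:
    """Return the span between the highest and lowest voiced F0 value, in semitones.

    Frames that are None or non-positive are treated as unvoiced and skipped.
    Returns 0.0 when fewer than two voiced frames are available.
    """
    voiced = [f for f in f0_hz if f is not None and f > 0]
    if len(voiced) < 2:
        return 0.0
    # 12 semitones per octave; an octave is a doubling of frequency.
    return 12.0 * math.log2(max(voiced) / min(voiced))


def duration_ratio(word_duration_s: float, reference_duration_s: float) -> float:
    """Return the word duration relative to a reference duration (assumed baseline)."""
    if reference_duration_s <= 0:
        raise ValueError("reference duration must be positive")
    return word_duration_s / reference_duration_s


# Hypothetical F0 contour for one word, in Hz (0.0 and None mark unvoiced frames).
contour = [200.0, 0.0, 220.0, None, 400.0]
span = f0_span_semitones(contour)   # 400 Hz over 200 Hz is one octave
ratio = duration_ratio(0.6, 0.4)    # word is 1.5 times the assumed baseline
print(f"F0 span: {span:.1f} st, duration ratio: {ratio:.2f}")
```

A semitone scale is used here only because it makes spans comparable across speakers with different pitch ranges; the choice of reference duration (for example, a speaker's average for the same word or syllable count) is likewise left open.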