Shinnosuke Takamichi, T. Toda, Graham Neubig, S. Sakti, Satoshi Nakamura
{"title":"在基于hmm的语音合成中修改调制频谱的后滤波器","authors":"Shinnosuke Takamichi, T. Toda, Graham Neubig, S. Sakti, Satoshi Nakamura","doi":"10.1109/ICASSP.2014.6853604","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a postfilter to compensate modulation spectrum in HMM-based speech synthesis. In order to alleviate over-smoothing effects which is a main cause of quality degradation in HMM-based speech synthesis, it is necessary to consider features that can capture over-smoothing. Global Variance (GV) is one well-known example of such a feature, and the effectiveness of parameter generation algorithm considering GV have been confirmed. However, the quality gap between natural speech and synthetic speech is still large. In this paper, we introduce the Modulation Spectrum (MS) of speech parameter trajectory as a new feature to effectively capture the over-smoothing effect, and we propose a postfilter based on the MS. The MS is represented as a power spectrum of the parameter trajectory. The generated speech parameter sequence is filtered to ensure that its MS has a pattern similar to natural speech. Experimental results show quality improvements when the proposed methods are applied to spectral and F0 components, compared with conventional methods considering GV.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"27 1","pages":"290-294"},"PeriodicalIF":0.0000,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"76","resultStr":"{\"title\":\"A postfilter to modify the modulation spectrum in HMM-based speech synthesis\",\"authors\":\"Shinnosuke Takamichi, T. Toda, Graham Neubig, S. Sakti, Satoshi Nakamura\",\"doi\":\"10.1109/ICASSP.2014.6853604\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a postfilter to compensate modulation spectrum in HMM-based speech synthesis. In order to alleviate over-smoothing effects which is a main cause of quality degradation in HMM-based speech synthesis, it is necessary to consider features that can capture over-smoothing. Global Variance (GV) is one well-known example of such a feature, and the effectiveness of parameter generation algorithm considering GV have been confirmed. However, the quality gap between natural speech and synthetic speech is still large. In this paper, we introduce the Modulation Spectrum (MS) of speech parameter trajectory as a new feature to effectively capture the over-smoothing effect, and we propose a postfilter based on the MS. The MS is represented as a power spectrum of the parameter trajectory. The generated speech parameter sequence is filtered to ensure that its MS has a pattern similar to natural speech. Experimental results show quality improvements when the proposed methods are applied to spectral and F0 components, compared with conventional methods considering GV.\",\"PeriodicalId\":6545,\"journal\":{\"name\":\"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"volume\":\"27 1\",\"pages\":\"290-294\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-05-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"76\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICASSP.2014.6853604\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.2014.6853604","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A postfilter to modify the modulation spectrum in HMM-based speech synthesis
In this paper, we propose a postfilter to compensate modulation spectrum in HMM-based speech synthesis. In order to alleviate over-smoothing effects which is a main cause of quality degradation in HMM-based speech synthesis, it is necessary to consider features that can capture over-smoothing. Global Variance (GV) is one well-known example of such a feature, and the effectiveness of parameter generation algorithm considering GV have been confirmed. However, the quality gap between natural speech and synthetic speech is still large. In this paper, we introduce the Modulation Spectrum (MS) of speech parameter trajectory as a new feature to effectively capture the over-smoothing effect, and we propose a postfilter based on the MS. The MS is represented as a power spectrum of the parameter trajectory. The generated speech parameter sequence is filtered to ensure that its MS has a pattern similar to natural speech. Experimental results show quality improvements when the proposed methods are applied to spectral and F0 components, compared with conventional methods considering GV.