Hideki Kawahara, M. Morise, T. Toda, Hideki Banno, R. Nisimura, T. Irino
{"title":"基于周期信号的时间静态群延迟表示的高质量语音处理系统的激励源设计","authors":"Hideki Kawahara, M. Morise, T. Toda, Hideki Banno, R. Nisimura, T. Irino","doi":"10.1109/APSIPA.2014.7041594","DOIUrl":null,"url":null,"abstract":"A new group delay representation, which yields value zero for periodic signals irrespective to the initial phase and the relative level of each harmonic component. This new group delay representation provides a unified basis for defining \"aperiodicity\" in speech sounds. For example, the periodic to noise ratio or harmonic to noise ratio is directly derived from the deviation of this group delay representation from value zero, after removing FM effects of harmonic frequencies and removing AM effects of harmonic component level. The derived deviation is combined with estimated excitation duration information and used to design aperiodic components of excitation source for high-quality synthetic speech. The proposed group delay representation is based on FO-adaptive weighted average of frequency shifted versions and temporally shifted versions of group delays with power spectral weighting.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":"122 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Excitation source design for high-quality speech manipulation systems based on a temporally static group delay representation of periodic signals\",\"authors\":\"Hideki Kawahara, M. Morise, T. Toda, Hideki Banno, R. Nisimura, T. Irino\",\"doi\":\"10.1109/APSIPA.2014.7041594\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A new group delay representation, which yields value zero for periodic signals irrespective to the initial phase and the relative level of each harmonic component. This new group delay representation provides a unified basis for defining \\\"aperiodicity\\\" in speech sounds. For example, the periodic to noise ratio or harmonic to noise ratio is directly derived from the deviation of this group delay representation from value zero, after removing FM effects of harmonic frequencies and removing AM effects of harmonic component level. The derived deviation is combined with estimated excitation duration information and used to design aperiodic components of excitation source for high-quality synthetic speech. The proposed group delay representation is based on FO-adaptive weighted average of frequency shifted versions and temporally shifted versions of group delays with power spectral weighting.\",\"PeriodicalId\":231382,\"journal\":{\"name\":\"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific\",\"volume\":\"122 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/APSIPA.2014.7041594\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSIPA.2014.7041594","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Excitation source design for high-quality speech manipulation systems based on a temporally static group delay representation of periodic signals
A new group delay representation, which yields value zero for periodic signals irrespective to the initial phase and the relative level of each harmonic component. This new group delay representation provides a unified basis for defining "aperiodicity" in speech sounds. For example, the periodic to noise ratio or harmonic to noise ratio is directly derived from the deviation of this group delay representation from value zero, after removing FM effects of harmonic frequencies and removing AM effects of harmonic component level. The derived deviation is combined with estimated excitation duration information and used to design aperiodic components of excitation source for high-quality synthetic speech. The proposed group delay representation is based on FO-adaptive weighted average of frequency shifted versions and temporally shifted versions of group delays with power spectral weighting.