Pramod B. Bachhav, M. Todisco, M. M. Idrissa, C. Beaugeant, N. Evans
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Publication date (as listed): 2017-08-04. DOI: 10.1109/ICASSP.2017.7953218 (https://doi.org/10.1109/ICASSP.2017.7953218)
Artificial bandwidth extension using the constant Q transform
Most artificial bandwidth extension (ABE) algorithms are based on the classical source-filter model of speech production. This approach generally requires the dual extension of each component through independent processing. Alternative approaches reported recently operate on the spectrum. With human perception thought to be largely insensitive to phase, most such approaches focus on the extension of the magnitude spectrum alone and rely on Fourier spectral analysis. This paper reports an approach to ABE based on the constant Q transform (CQT), a more perceptually motivated approach to spectral analysis. A Gaussian mixture model is used to estimate missing highband components from available narrowband components before resynthesis with phase estimates obtained from the upsampled narrowband signal. Objective assessment shows that energy normalisation is critical to performance. These findings and the appeal of CQT for ABE are confirmed through informal subjective tests based on the mean opinion score.
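The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first part shows why the CQT counts as perceptually motivated: its bins are spaced geometrically, so the ratio of centre frequency to bandwidth (Q) is the same for every bin, whereas Fourier bins are spaced linearly. The second part shows one common way to use a joint GMM for regression, the conditional mean of a highband feature given a narrowband feature. Whether the paper uses exactly this estimator is an assumption, and the scalar features, function names and all parameter values below are hypothetical and chosen only to keep the example short.

```python
import math


def cqt_center_frequencies(f_min, f_max, bins_per_octave):
    """Geometrically spaced centre frequencies f_k = f_min * 2**(k / B)."""
    n_octaves = math.log2(f_max / f_min)
    # Small epsilon guards against floating-point round-off at exact octaves.
    n_bins = int(math.floor(bins_per_octave * n_octaves + 1e-9)) + 1
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def q_factor(bins_per_octave):
    """Q = f_k / bandwidth_k; it depends only on B, not on the bin index k."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def gauss_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_conditional_mean(x, components):
    """Estimate a highband feature y from a narrowband feature x.

    components: list of (weight, mu_x, mu_y, var_x, cov_xy), one tuple per
    mixture component of a joint GMM over (x, y). Scalar features are a
    simplification (assumption); real feature vectors are multi-dimensional.
    """
    # Posterior responsibility of each component given the observed x.
    likes = [w * gauss_pdf(x, mu_x, var_x) for (w, mu_x, _, var_x, _) in components]
    total = sum(likes)
    estimate = 0.0
    for like, (_, mu_x, mu_y, var_x, cov_xy) in zip(likes, components):
        # Per-component linear regression of y on x, weighted by the posterior.
        estimate += (like / total) * (mu_y + cov_xy / var_x * (x - mu_x))
    return estimate


if __name__ == "__main__":
    freqs = cqt_center_frequencies(62.5, 8000.0, 12)  # hypothetical range
    q = q_factor(12)
    for f in (freqs[0], freqs[-1]):
        print(f"centre {f:8.1f} Hz, bandwidth {f / q:7.2f} Hz")

    # Hypothetical two-component joint model.
    model = [(0.5, -1.0, -2.0, 1.0, 0.5), (0.5, 1.0, 2.0, 1.0, 0.5)]
    print(gmm_conditional_mean(0.3, model))
```

With constant Q, the bandwidth grows in proportion to the centre frequency, so the low bins are narrow and the high bins are wide. The regression step leaves out the energy normalisation that the abstract describes as critical, because the abstract does not say how that normalisation is done.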