Vector quantization of harmonic magnitudes for low-rate speech coders
P. Lupini, V. Cuperman
1994 IEEE GLOBECOM. Communications: The Global Bridge, 1994-11-28
DOI: 10.1109/GLOCOM.1994.512716
Several techniques for speech coding at rates of 4 kb/s and below (for example, multiband excitation coding, sinusoidal transform coding, and time-frequency interpolation) require quantization of spectral magnitudes at a set of frequencies that are harmonics of the talker's fundamental (pitch) frequency. The number of harmonic magnitudes to be quantized depends on the fundamental frequency and therefore varies from frame to frame, which makes it difficult to use fixed-dimension vector quantization for harmonic magnitude encoding. In this paper, we introduce a quantization technique called non-square transform vector quantization (NSTVQ), which combines a fixed-dimension vector quantizer with a variable-size non-square transform that maps the variable-dimension harmonic magnitude vector into a fixed-dimension vector. The optimal reconstruction procedure for non-square transforms is derived and shown to be equivalent to an optimal least-squares estimation procedure. The proposed technique is evaluated experimentally as part of a new coding system called spectral excitation coding (SEC). The results are compared with an existing technique that estimates the spectral shape using all-pole modeling followed by vector quantization of the line spectral pair (LSP) parameters.
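The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the transform, the fixed dimension, or the codebook, so the following are assumptions made only for illustration: a 4 kHz speech bandwidth, a fixed dimension of 30, a truncated or zero-padded orthonormal DCT as the non-square transform, and a random stand-in codebook. All function names are hypothetical. Reconstruction uses the pseudoinverse, which gives the least-squares estimate of the harmonic vector from the quantized fixed-dimension vector.

```python
import numpy as np

def num_harmonics(f0_hz, bandwidth_hz=4000.0):
    """Count harmonics of f0 up to the band edge (4 kHz is an assumption)."""
    return int(np.floor(bandwidth_hz / f0_hz))

def dct_matrix(n):
    """Orthonormal n x n DCT-II matrix; rows are the basis vectors."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def analysis_matrix(n, m):
    """Non-square m x n matrix B mapping an n-vector to a fixed m-vector.

    Assumed construction: keep the first m DCT rows when n > m,
    or pad the n DCT rows with zero rows when n < m.
    """
    c = dct_matrix(n)
    b = np.zeros((m, n))
    r = min(m, n)
    b[:r, :] = c[:r, :]
    return b

def nearest_codevector(y, codebook):
    """Index of the codebook row closest to y in squared Euclidean distance."""
    d = np.sum((codebook - y) ** 2, axis=1)
    return int(np.argmin(d))

def encode(x, codebook):
    """Transform a variable-length vector x and quantize it at fixed dimension."""
    m = codebook.shape[1]
    b = analysis_matrix(len(x), m)
    return nearest_codevector(b @ x, codebook)

def decode(index, n, codebook):
    """Least-squares reconstruction of an n-vector from the chosen codevector."""
    m = codebook.shape[1]
    b = analysis_matrix(n, m)
    return np.linalg.pinv(b) @ codebook[index]

# Usage: two frames with different pitch share one fixed-dimension codebook.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 30))      # hypothetical stand-in, not trained
for f0 in (100.0, 250.0):
    n = num_harmonics(f0)
    x = rng.uniform(0.1, 1.0, size=n)     # synthetic harmonic magnitudes
    idx = encode(x, codebook)
    x_hat = decode(idx, n, codebook)
    print(f0, n, idx, x_hat.shape)
```

With this particular choice of transform the nonzero rows of B are orthonormal, so the pseudoinverse reduces to the transpose of B. A trained codebook and a different transform would follow the same encode and decode structure.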