Bandwidth expansion of speech based on vector quantization of the mel frequency cepstral coefficients
N. Enbom, W. Kleijn
1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351), published 1999-06-20
DOI: 10.1109/SCFT.1999.781521 (https://doi.org/10.1109/SCFT.1999.781521)
Telephone speech is usually limited to less than 4 kHz in bandwidth. This bandwidth limitation results in the typical sound of telephone speech. We present a new method of regenerating the high frequencies (4-8 kHz) based on vector quantization of the mel-frequency cepstral coefficients (MFCC). We also present two methods to avoid perceptually annoying overestimates of the signal power in the high-band. Listening tests show the benefits of the new procedures. Use of MFCC for vector quantization instead of traditionally used spectral representations improves the quality of the speech significantly. Tests also show that the wide-band speech reconstructed with the method is significantly more pleasant to the human ear than the original narrowband speech.
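
The abstract does not give implementation details, so the following is only a minimal sketch of the general codebook-mapping idea behind vector-quantization-based bandwidth extension, not the authors' procedure. It assumes that per-frame narrowband feature vectors (for example MFCCs) and matching high-band envelope vectors have already been extracted from wideband training speech. All function names are hypothetical, the deterministic codebook initialization is a simplification, and the two power-limiting methods mentioned in the abstract are not reproduced here.

```python
# Sketch only: paired-codebook mapping from narrowband features to a
# high-band envelope estimate. Feature extraction is assumed to happen
# elsewhere; every name below is hypothetical.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest(codebook, v):
    """Index of the codeword closest to v (first one wins on ties)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def assign(codebook, feats):
    """Group training-frame indices by their nearest codeword."""
    cells = [[] for _ in codebook]
    for idx, v in enumerate(feats):
        cells[nearest(codebook, v)].append(idx)
    return cells

def train_paired_codebooks(nb_feats, hb_feats, size, iters=20):
    """Train a narrowband codebook with Lloyd iterations, then attach to
    each codeword the mean high-band vector of the frames in its cell."""
    # Assumption: evenly spaced training frames serve as initial codewords.
    step = len(nb_feats) / size
    nb_codebook = [list(nb_feats[int(i * step)]) for i in range(size)]
    for _ in range(iters):
        for i, members in enumerate(assign(nb_codebook, nb_feats)):
            if members:  # keep the old codeword if its cell is empty
                nb_codebook[i] = mean([nb_feats[j] for j in members])
    hb_codebook = []
    for members in assign(nb_codebook, nb_feats):
        if members:
            hb_codebook.append(mean([hb_feats[j] for j in members]))
        else:
            hb_codebook.append(mean(hb_feats))  # fallback for an empty cell
    return nb_codebook, hb_codebook

def estimate_high_band(nb_codebook, hb_codebook, nb_vector):
    """Look up the high-band vector paired with the nearest codeword."""
    return hb_codebook[nearest(nb_codebook, nb_vector)]

if __name__ == "__main__":
    # Toy data invented for illustration: two well-separated clusters.
    nb = [[0.0, 0.1], [0.1, 0.0], [-0.1, 0.0],
          [10.0, 10.1], [10.1, 10.0], [9.9, 10.0]]
    hb = [[1.0], [1.0], [1.0], [5.0], [5.0], [5.0]]
    nb_cb, hb_cb = train_paired_codebooks(nb, hb, size=2)
    print(estimate_high_band(nb_cb, hb_cb, [0.2, -0.1]))
    print(estimate_high_band(nb_cb, hb_cb, [9.5, 10.4]))
```

In a real system the estimated high-band vector would drive a synthesis stage that regenerates the 4-8 kHz band; that stage, and any gain control to avoid overestimating high-band power, is outside the scope of this sketch.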