Authors: P. Ojala, P. Haavisto, A. Lakaniemi, J. Vainio
Published in: 1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351)
Publication date: 1999-06-20
DOI: 10.1109/SCFT.1999.781502 (https://doi.org/10.1109/SCFT.1999.781502)
A novel pitch-lag search method using adaptive weighting and median filtering
This paper presents a novel method for estimating the pitch-lag in a speech codec. The pitch-lag is related to the fundamental frequency of the speech signal, and accurate estimation of this parameter is important for the subjective quality of the synthesised speech. A common problem in speech codecs is that pitch-lag estimation often produces a multiple or a sub-multiple of the true pitch value. When these incorrect pitch-lag values are used in speech synthesis, the subjective quality of the speech is degraded. In the improved method presented here, the pitch-lag estimate is biased towards the pitch-lag values of the previous speech segments, resulting in a consistent set of consecutive pitch-lag values and a high-quality reconstructed signal. The classification of speech into voiced and unvoiced parts is used when tracking the pitch-lag values and adapting the weighting function, which is centered on the pitch track.
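
The abstract does not give the algorithm itself, so the following Python sketch only illustrates the general idea of biasing a lag search towards the recent pitch track. Everything specific in it is an assumption made for illustration: the function names, the lag range of 20 to 143 samples, the linear weighting shape, the `strength` value, the use of the median of previous lags as the weighting center (suggested by the title), and the rule that the weighting is switched off for unvoiced speech.

```python
import math
from statistics import median


def normalized_correlation(signal, lag):
    """Normalized cross-correlation between the signal and itself delayed by `lag`."""
    num = e_now = e_past = 0.0
    for n in range(lag, len(signal)):
        num += signal[n] * signal[n - lag]
        e_now += signal[n] ** 2
        e_past += signal[n - lag] ** 2
    denom = math.sqrt(e_now * e_past)
    return num / denom if denom > 0.0 else 0.0


def weighted_pitch_lag(signal, lag_history, voiced,
                       lag_min=20, lag_max=143, strength=0.3):
    """Return the lag with the highest (optionally weighted) correlation.

    Hypothetical adaptation rule: the pitch track is trusted only when the
    speech is classified as voiced; otherwise the search is unweighted.
    """
    center = median(lag_history) if (voiced and lag_history) else None
    best_lag, best_score = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        score = normalized_correlation(signal, lag)
        if center is not None:
            # Assumed weighting shape: a linear penalty that grows with the
            # relative distance from the track center and saturates at 1.
            distance = min(1.0, abs(lag - center) / center)
            score *= 1.0 - strength * distance
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic test signal: a decaying pulse repeated every 40 samples, with
# every second pulse scaled by 0.9, so lag 80 correlates slightly better
# than lag 40.
base = [0.9 ** k for k in range(40)]
signal = (base + [0.9 * v for v in base]) * 4
history = [40, 41, 40, 39, 40]

free = weighted_pitch_lag(signal, history, voiced=False)    # unweighted search
tracked = weighted_pitch_lag(signal, history, voiced=True)  # biased to the track
print(free, tracked)
```

On this synthetic signal the unweighted search returns the doubled lag, 80, while the weighted search stays at 40, the value supported by the lag history. A linear penalty was chosen only because it is easy to read; any weighting that decreases away from the track center would serve the same illustrative purpose.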