Authors: Wenhui Jia, W. Chan
Venue: Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, 2002
Publication date: 2002-11-03
DOI: 10.1109/ACSSC.2002.1197178 (https://doi.org/10.1109/ACSSC.2002.1197178)
Joint pitch and voicing estimation for multiband excitation and sinusoidal speech coders
In conventional multi-band excitation (MBE) speech encoding, the pitch is estimated first from the speech signal, and voicing decisions are then made for pitch-spaced spectral bands using that estimate. Because the pitch estimate is invariably computed from a signal that still contains unvoiced components, the accuracy of both the estimated pitch and the voicing decisions is degraded. A novel pitch and voicing estimation scheme is presented in which the spectrum of the speech signal is first segmented into voiced and unvoiced regions without knowledge of the pitch. The pitch is then estimated only from the voiced regions. Experimental results show that the new scheme improves the accuracy of the estimated pitch and voicing decisions and offers better speech quality.
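
The two-stage order described in the abstract (segment the spectrum first, then estimate the pitch from the voiced part only) can be illustrated with a toy sketch. The abstract does not say how the segmentation or the pitch search is actually performed, so everything below is an assumption made for illustration: the peak-to-mean test in `segment_voiced`, the comb score in `comb_score`, the thresholds, and the synthetic spectrum are hypothetical stand-ins, not the authors' method.

```python
from typing import List


def segment_voiced(mag: List[float], band: int = 32, ratio: float = 4.0) -> List[bool]:
    """Label each bin voiced/unvoiced without using any pitch value.

    Assumption (not from the paper): a fixed-width band is called voiced
    when its peak-to-mean magnitude ratio exceeds `ratio`.
    """
    mask = [False] * len(mag)
    for start in range(0, len(mag), band):
        chunk = mag[start:start + band]
        mean = sum(chunk) / len(chunk)
        voiced = mean > 0 and max(chunk) / mean > ratio
        for i in range(start, start + len(chunk)):
            mask[i] = voiced
    return mask


def comb_score(mag: List[float], mask: List[bool], period: int) -> float:
    """Mean of (harmonic bin minus midpoint bin), over voiced harmonics only.

    Assumption: a simple comb with negative teeth between harmonics,
    used here only as a stand-in for a real pitch estimator.
    """
    total, count = 0.0, 0
    k = 1
    while k * period < len(mag):
        h = k * period
        if mask[h]:  # harmonics in unvoiced regions are ignored
            total += mag[h] - mag[h - period // 2]
            count += 1
        k += 1
    return total / count if count else float("-inf")


def estimate_pitch_bins(mag: List[float], mask: List[bool], lo: int = 8, hi: int = 40) -> int:
    """Return the harmonic spacing (in bins) with the best comb score."""
    return max(range(lo, hi + 1), key=lambda p: comb_score(mag, mask, p))


# Toy magnitude spectrum (hypothetical data): harmonics every 16 bins in
# the lower half, a flat noise-like floor in the upper half.
FS, N_FFT, N_BINS = 8000, 512, 256
mag = [0.5 if i >= 128 else (1.0 if i > 0 and i % 16 == 0 else 0.0)
       for i in range(N_BINS)]

mask = segment_voiced(mag)                    # stage 1: no pitch needed
period_bins = estimate_pitch_bins(mag, mask)  # stage 2: voiced bins only
pitch_hz = period_bins * FS / N_FFT
print(period_bins, pitch_hz)
```

On this toy input the lower half of the spectrum is labelled voiced and the search settles on a 16-bin harmonic spacing. Real speech spectra are far less clean, so the fixed band width and threshold here would need to be replaced by whatever criterion the paper actually uses.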