Joint pitch and voicing estimation for multiband excitation and sinusoidal speech coders
Wenhui Jia, W. Chan
Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, 2002. Published 2002-11-03.
DOI: 10.1109/ACSSC.2002.1197178 (https://doi.org/10.1109/ACSSC.2002.1197178)
Citations: 1
Abstract
In conventional multi-band excitation (MBE) speech encoding, pitch is estimated first from the speech signal. Using the estimated pitch, voicing decisions are then made for pitch-spaced spectral bands. Because this method invariably includes the unvoiced components of the speech signal when estimating the pitch, the accuracy of both the estimated pitch and the voicing decisions is degraded. A novel pitch and voicing estimation scheme is presented, wherein the spectrum of the speech signal is segmented into voiced and unvoiced regions without knowledge of the pitch. Pitch is then estimated only from the voiced regions. Experimental results show that the new scheme improves the accuracy of the estimated pitch and voicing decisions, and offers better speech quality.
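The abstract does not say how the spectrum is segmented or how pitch is then extracted, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a spectral-flatness test on fixed-width bands as a stand-in for pitch-independent voicing segmentation, and an autocorrelation computed from the voiced bins only as a stand-in for pitch estimation. All function names, the band width of 16 bins, the flatness threshold of 0.3, and the 60 to 400 Hz search range are hypothetical choices made for this example.

```python
import math

def power_spectrum(frame):
    """One-sided power spectrum of a Hann-windowed frame (plain DFT, stdlib only)."""
    n = len(frame)
    # Periodic Hann window (assumed here; the abstract names no window).
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    power = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        power.append(re * re + im * im)
    return power

def voiced_bands(power, band_bins=16, max_flatness=0.3):
    """Flag each fixed-width band as voiced when its spectral flatness is low.

    The bands have a fixed width, so no pitch value is needed at this stage.
    Flatness = geometric mean / arithmetic mean (near 1 for flat, noise-like bands).
    """
    flags = []
    for start in range(0, len(power), band_bins):
        band = [p + 1e-12 for p in power[start:start + band_bins]]  # avoid log(0)
        geometric = math.exp(sum(math.log(p) for p in band) / len(band))
        arithmetic = sum(band) / len(band)
        flags.append(geometric / arithmetic < max_flatness)
    return flags

def estimate_pitch(power, flags, fs, n, band_bins=16, fmin=60.0, fmax=400.0):
    """Return fs / lag for the lag that maximizes the autocorrelation of voiced bins."""
    voiced = [(k, p) for k, p in enumerate(power) if flags[k // band_bins]]
    if not voiced:
        return None  # no voiced region found in this frame

    def autocorr(lag):
        # Inverse cosine transform of the masked power spectrum at this lag.
        return sum(p * math.cos(2 * math.pi * k * lag / n) for k, p in voiced)

    best_lag = max(range(int(fs / fmax), int(fs / fmin) + 1), key=autocorr)
    return fs / best_lag

# Synthetic frame: harmonics of 200 Hz up to 1600 Hz, plus a single click whose
# flat spectrum stands in for an unvoiced component (an assumption for this demo).
fs, n = 8000, 320
frame = [sum(math.cos(2 * math.pi * 200 * m * i / fs) for m in range(1, 9))
         for i in range(n)]
frame[n // 2] += 2.0

power = power_spectrum(frame)
flags = voiced_bands(power)
print(flags, estimate_pitch(power, flags, fs, n))
```

On this synthetic frame the five bands below 2 kHz are flagged voiced, the remaining bands are flagged unvoiced, and the estimate is 200 Hz. Real speech would need a more careful voicing measure and some protection against octave errors, neither of which the abstract specifies.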