Robust Speech Coding for the Preservation of Speaker Identity
M. Phythian, J. Leis, S. Sridharan
Fourth International Symposium on Signal Processing and Its Applications
DOI: 10.1109/ISSPA.1996.615767
Citations: 2
Abstract
Low-bitrate speech coding usually requires robustness to a wide range of speakers. The problem we report on here is one where the compression rate must be maximized for archival purposes, yet the compressed information must remain available at a later date for identifying a new speaker, who may or may not have been recorded in the archived database. As would be expected, the ability to identify a particular speaker by comparison against the compressed speech information is impaired, to an extent related to the degree of compression. Furthermore, automatic speaker recognition algorithms depend upon a parameterization of the speech which may not be available in the required quantity in the compressed data stream. We present our results in identifying a speaker using two common methods applied to the data stream produced by a class of spectral vector compression algorithms. It is shown experimentally that a simplified, easily computed distance metric algorithm is somewhat more sensitive to the compression process than a substantially more complex multivariate statistical modelling method.
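The contrast between the two families of identification method can be illustrated with a minimal sketch. The abstract does not specify either method or the compression scheme, so everything below is an assumption made for illustration: the distance metric is taken to be the Euclidean distance between mean feature vectors, the statistical model is a single diagonal-covariance Gaussian per speaker, a uniform quantizer stands in for spectral vector compression, and the feature values are invented toy data. It does not reproduce the paper's experiment or its result.

```python
import math


def mean_vector(frames):
    """Per-dimension mean of a list of feature vectors."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]


def variance_vector(frames, mean, floor=1e-3):
    """Per-dimension variance, floored so quantized data cannot give zero."""
    n = len(frames)
    return [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, floor)
        for d in range(len(mean))
    ]


def quantize(frames, step):
    """Uniform scalar quantizer: a crude stand-in for lossy compression."""
    return [[round(x / step) * step for x in f] for f in frames]


def identify_by_distance(test_frames, enrolled):
    """Assumed simple method: nearest enrolled mean vector (Euclidean)."""
    test_mean = mean_vector(test_frames)
    return min(
        enrolled,
        key=lambda name: math.dist(test_mean, mean_vector(enrolled[name])),
    )


def gaussian_log_likelihood(frames, mean, var):
    """Log-likelihood of frames under a diagonal-covariance Gaussian."""
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def identify_by_gaussian(test_frames, enrolled):
    """Assumed statistical method: highest-likelihood speaker model."""
    def score(name):
        m = mean_vector(enrolled[name])
        v = variance_vector(enrolled[name], m)
        return gaussian_log_likelihood(test_frames, m, v)

    return max(enrolled, key=score)


# Invented two-dimensional "spectral" features for two hypothetical speakers.
enrolled = {
    "A": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "B": [[3.0, 0.5], [3.2, 0.7], [2.8, 0.3]],
}
test_frames = [[1.1, 2.1], [0.9, 1.9]]

# Compare both methods on the archived data at several quantizer step sizes.
for step in (0.1, 0.5):
    archived = {name: quantize(frames, step) for name, frames in enrolled.items()}
    print(
        step,
        identify_by_distance(test_frames, archived),
        identify_by_gaussian(test_frames, archived),
    )
```

Increasing `step` coarsens the archived features. Measuring how each method's accuracy degrades with the step size would require a real speaker corpus and the actual compression algorithms, which this toy example does not attempt.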