Yoshitaka Ohshio, Haruka Adachi, Kenta Iwai, T. Nishiura, Y. Yamashita
Active Speech Obscuration with Speaker-dependent Human Speech-like Noise for Speech Privacy
2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), published 2018-11-01
DOI: https://doi.org/10.23919/APSIPA.2018.8659754
Citations: 3
Abstract
This paper introduces a new active speech obscuration method that uses speaker-dependent human speech-like noise (HSLN) to protect speech privacy. Speech privacy has recently come to be regarded as an important issue in open public spaces such as hospitals, pharmacies, and banks. To protect it, speech obscuration methods that use HSLN have been studied. An HSLN is designed by superposing multiple speech signals, and obscuration is achieved when a listener hears the target speech and the HSLN at the same time. Conventionally, HSLN is designed with the pitch of the target speech as the sole speaker-dependent characteristic. However, obscuration performance with pitch alone is still insufficient, so additional speaker-dependent characteristics are required. We therefore propose a speaker-dependent HSLN design method that uses the third formant frequency of the target speech, in addition to its pitch, as a speaker-dependent characteristic. The third formant frequency is related to voice quality, which depends on the shape and length of the vocal tract. It follows that the proposed method can mask the target speech effectively with an HSLN that reflects the pitch and third formant frequency analyzed from that speech. Experimental results demonstrate the effectiveness of the proposed method.
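The design idea in the abstract can be illustrated with a minimal sketch. The abstract says only that an HSLN is built by superposing speech signals and that pitch and the third formant frequency serve as speaker-dependent characteristics; it does not say how they enter the design. The sketch below assumes, purely for illustration, that candidate utterances are chosen by closeness to the target in pitch (F0) and third formant (F3) and then summed. The function names `select_similar` and `superpose`, the dictionary keys `f0` and `f3`, and the relative-distance measure are all hypothetical and not taken from the paper.

```python
import numpy as np

def select_similar(target_f0, target_f3, candidates, k):
    """Return indices of the k candidates closest to the target in (F0, F3).

    Assumption: each candidate is a dict with precomputed "f0" and "f3" in Hz.
    """
    feats = np.array([(c["f0"], c["f3"]) for c in candidates], dtype=float)
    target = np.array([target_f0, target_f3], dtype=float)
    # Relative differences, so that F0 (hundreds of Hz) and F3 (thousands of Hz)
    # contribute on a comparable scale. This weighting is an assumption.
    dist = np.linalg.norm((feats - target) / target, axis=1)
    return np.argsort(dist)[:k]

def superpose(signals):
    """Sum signals truncated to the shortest length, then peak-normalize."""
    n = min(len(s) for s in signals)
    mix = np.sum([np.asarray(s[:n], dtype=float) for s in signals], axis=0)
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 0 else mix

# Toy usage with synthetic tones standing in for recorded utterances.
fs = 16000
t = np.arange(fs) / fs
candidates = [
    {"f0": 118.0, "f3": 2480.0, "wave": np.sin(2 * np.pi * 118.0 * t)},
    {"f0": 220.0, "f3": 2900.0, "wave": np.sin(2 * np.pi * 220.0 * t)},
    {"f0": 125.0, "f3": 2550.0, "wave": np.sin(2 * np.pi * 125.0 * t)},
]
chosen = select_similar(120.0, 2500.0, candidates, k=2)
noise = superpose([candidates[i]["wave"] for i in chosen])
```

In practice F0 and F3 would be estimated from the target speech (for example with autocorrelation and linear prediction analysis), and the superposed material would be real speech rather than tones; both steps are outside this sketch.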