{"title":"Speaker Anonymization for Machines using Sinusoidal Model","authors":"Ayush Agarwal, Amitabh Swain, S. Prasanna","doi":"10.1109/SPCOM55316.2022.9840792","DOIUrl":null,"url":null,"abstract":"With the widespread use of speech technologies, speaker identity/voiceprint protection has become very important. Many methods have been proposed in the literature that protects the speaker’s identity either by modifying the voice or replacing it with another speaker’s identity. Both authentication systems and humans cannot recognize the speaker’s identity in those approaches. Changing the speaker identity of original speech cannot be used for the applications in which we want to conceal speaker identity from machine authentication and, at the same time, keep the speaker’s voice as it is. Noise addition methods have been proposed in the literature to address this issue. However, adding noise to the signal increases the irritation effect on speech perception. This paper proposes a sinusoidal model-based approach that solves this issue. The proposed method does not interfere with the originality of speech but, at the same time, protects the speaker’s identity for the automatic speaker verification (ASV) system by degrading its performance. The proposed approach’s anonymized speech is tested on the ASV system for TIMIT and IITG-MV datasets, and an equal error rate (EER) is reported. Intelligence tests like short-time objective intelligibility (STOI) and mean opinion score (MOS) is also done. By taking both EER and intelligibility tests together into consideration, it is shown that the proposed approach can solve the discussed issue.","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"194 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPCOM55316.2022.9840792","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
With the widespread use of speech technologies, protecting a speaker's identity/voiceprint has become very important. Many methods have been proposed in the literature that protect the speaker's identity either by modifying the voice or by replacing it with another speaker's identity. In those approaches, neither authentication systems nor humans can recognize the speaker's identity. Changing the speaker identity of the original speech is therefore unsuitable for applications in which we want to conceal the speaker's identity from machine authentication while keeping the speaker's voice as it is. Noise addition methods have been proposed in the literature to address this issue. However, adding noise to the signal introduces an irritation effect that degrades speech perception. This paper proposes a sinusoidal model-based approach that solves this issue. The proposed method preserves the original character of the speech while protecting the speaker's identity from the automatic speaker verification (ASV) system by degrading its performance. The speech anonymized by the proposed approach is tested on the ASV system for the TIMIT and IITG-MV datasets, and the equal error rate (EER) is reported. Intelligibility tests such as short-time objective intelligibility (STOI) and mean opinion score (MOS) are also performed. By considering the EER and the intelligibility tests together, it is shown that the proposed approach can solve the discussed issue.
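The abstract does not give implementation details of the sinusoidal model, so the following is only a minimal Python sketch of a generic sinusoidal analysis/resynthesis pass: each frame is represented by its strongest spectral peaks and rebuilt from those sinusoids alone. The frame length, hop size, and number of retained peaks are assumptions for illustration, not the paper's settings.

    # Minimal sinusoidal analysis/resynthesis sketch (illustrative only;
    # frame_len, hop, and n_peaks are assumed values, not the paper's).
    import numpy as np

    def sinusoidal_resynthesis(x, fs, frame_len=1024, hop=256, n_peaks=20):
        """Keep the n_peaks strongest spectral bins per frame and
        resynthesize the signal from those sinusoids only."""
        window = np.hanning(frame_len)
        y = np.zeros(len(x))
        norm = np.zeros(len(x))
        for start in range(0, len(x) - frame_len, hop):
            frame = x[start:start + frame_len] * window
            spec = np.fft.rfft(frame)
            mags = np.abs(spec)
            idx = np.argsort(mags)[-n_peaks:]          # crude peak picking
            freqs = idx * fs / frame_len               # bin -> Hz
            phases = np.angle(spec[idx])
            amps = 2.0 * mags[idx] / np.sum(window)    # undo window gain
            t = np.arange(frame_len) / fs
            synth = np.zeros(frame_len)
            for a, f, p in zip(amps, freqs, phases):
                synth += a * np.cos(2 * np.pi * f * t + p)
            # weighted overlap-add
            y[start:start + frame_len] += synth * window
            norm[start:start + frame_len] += window ** 2
        norm[norm == 0] = 1.0
        return y / norm

    if __name__ == "__main__":
        fs = 16000
        t = np.arange(fs) / fs
        x = 0.5 * np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 440 * t)
        y = sinusoidal_resynthesis(x, fs)
        print("input/output RMS:", np.sqrt(np.mean(x ** 2)), np.sqrt(np.mean(y ** 2)))

The EER metric mentioned in the abstract can be illustrated with a small helper that sweeps a decision threshold over ASV scores until the false rejection and false acceptance rates meet; the score distributions below are synthetic and hypothetical, used only to show the computation.

    # Illustrative EER computation from ASV trial scores (synthetic data).
    import numpy as np

    def equal_error_rate(target_scores, impostor_scores):
        """Return the operating point where false rejection ~= false acceptance."""
        thresholds = np.sort(np.concatenate([target_scores, impostor_scores]))
        best_gap, eer = np.inf, 1.0
        for thr in thresholds:
            frr = np.mean(target_scores < thr)    # genuine trials rejected
            far = np.mean(impostor_scores >= thr) # impostor trials accepted
            gap = abs(frr - far)
            if gap < best_gap:
                best_gap, eer = gap, (frr + far) / 2.0
        return eer

    rng = np.random.default_rng(0)
    genuine = rng.normal(1.0, 1.0, 1000)   # hypothetical genuine-trial scores
    impostor = rng.normal(-1.0, 1.0, 1000) # hypothetical impostor-trial scores
    print("EER ~", equal_error_rate(genuine, impostor))

A higher EER on anonymized speech indicates that the ASV system can no longer separate genuine from impostor trials, which is the effect the paper reports as evidence of identity protection.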