通过语音转换去识别说话人

Qin Jin, Arthur R. Toth, Tanja Schultz, A. Black
{"title":"通过语音转换去识别说话人","authors":"Qin Jin, Arthur R. Toth, Tanja Schultz, A. Black","doi":"10.1109/ASRU.2009.5373356","DOIUrl":null,"url":null,"abstract":"It is a common feature of modern automated voice-driven applications and services to record and transmit a user's spoken request. At the same time, several domains and applications may require keeping the content of the user's request confidential and at the same time preserving the speaker's identity. This requires a technology that allows the speaker's voice to be de-identified in the sense that the voice sounds natural and intelligible but does not reveal the identity of the speaker. In this paper we investigate different voice transformation strategies on a large population of speakers to disguise the speakers' identities while preserving the intelligibility of the voices. We apply two automatic speaker identification approaches to verify the success of de-identification with voice transformation, a GMM-based and a Phonetic approach. The evaluation based on the automatic speaker identification systems verifies that the proposed voice transformation technique enables transmission of the content of the users' spoken requests while successfully preserving their identities. Also, the results indicate that different speakers still sound distinct after the transformation. Furthermore, we carried out a human listening test that proved the transformed speech to be both intelligible and securely de-identified, as it hid the identity of the speakers even to listeners who knew the speakers very well.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"88 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"55","resultStr":"{\"title\":\"Speaker de-identification via voice transformation\",\"authors\":\"Qin Jin, Arthur R. Toth, Tanja Schultz, A. Black\",\"doi\":\"10.1109/ASRU.2009.5373356\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It is a common feature of modern automated voice-driven applications and services to record and transmit a user's spoken request. At the same time, several domains and applications may require keeping the content of the user's request confidential and at the same time preserving the speaker's identity. This requires a technology that allows the speaker's voice to be de-identified in the sense that the voice sounds natural and intelligible but does not reveal the identity of the speaker. In this paper we investigate different voice transformation strategies on a large population of speakers to disguise the speakers' identities while preserving the intelligibility of the voices. We apply two automatic speaker identification approaches to verify the success of de-identification with voice transformation, a GMM-based and a Phonetic approach. The evaluation based on the automatic speaker identification systems verifies that the proposed voice transformation technique enables transmission of the content of the users' spoken requests while successfully preserving their identities. Also, the results indicate that different speakers still sound distinct after the transformation. Furthermore, we carried out a human listening test that proved the transformed speech to be both intelligible and securely de-identified, as it hid the identity of the speakers even to listeners who knew the speakers very well.\",\"PeriodicalId\":292194,\"journal\":{\"name\":\"2009 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"volume\":\"88 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"55\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2009.5373356\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2009.5373356","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 55

摘要

记录和传输用户的语音请求是现代自动化语音驱动应用程序和服务的共同特征。与此同时,一些领域和应用程序可能需要对用户请求的内容保密,同时保留说话者的身份。这需要一种技术,可以让说话者的声音去识别,在某种意义上说,声音听起来自然和可理解,但不透露说话者的身份。本文研究了不同的语音转换策略,在大量说话人的情况下伪装说话人的身份,同时保持声音的可理解性。我们应用了两种自动说话人识别方法来验证语音转换去识别的成功,一种是基于gmm的方法,另一种是语音方法。基于自动说话人识别系统的评估验证了所提出的语音转换技术能够传输用户语音请求的内容,同时成功地保留其身份。结果表明,不同的说话人在转换后的声音仍然是不同的。此外,我们进行了一项人类听力测试,证明了转换后的语音既可理解又安全地去识别,因为它隐藏了说话者的身份,甚至对非常了解说话者的听众也是如此。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Speaker de-identification via voice transformation
It is a common feature of modern automated voice-driven applications and services to record and transmit a user's spoken request. At the same time, several domains and applications may require keeping the content of the user's request confidential and at the same time preserving the speaker's identity. This requires a technology that allows the speaker's voice to be de-identified in the sense that the voice sounds natural and intelligible but does not reveal the identity of the speaker. In this paper we investigate different voice transformation strategies on a large population of speakers to disguise the speakers' identities while preserving the intelligibility of the voices. We apply two automatic speaker identification approaches to verify the success of de-identification with voice transformation, a GMM-based and a Phonetic approach. The evaluation based on the automatic speaker identification systems verifies that the proposed voice transformation technique enables transmission of the content of the users' spoken requests while successfully preserving their identities. Also, the results indicate that different speakers still sound distinct after the transformation. Furthermore, we carried out a human listening test that proved the transformed speech to be both intelligible and securely de-identified, as it hid the identity of the speakers even to listeners who knew the speakers very well.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信