Transferable Waveform-level Adversarial Attack against Speech Anti-spoofing Models
Bingyuan Huang, Sanshuai Cui, Xiangui Kang, Enping Li
2023 IEEE International Conference on Multimedia and Expo (ICME), July 2023. DOI: 10.1109/ICME55011.2023.00395
Speech anti-spoofing models protect media from malicious fake speech but are vulnerable to adversarial attacks. Studying such attacks is conducive to developing robust speech anti-spoofing systems. Existing transfer-based attack methods mainly craft adversarial speech examples at the handcrafted-feature level, which offers limited attack capability against real-world anti-spoofing systems, since these systems expose only raw-waveform input interfaces. In this work, we propose a waveform-level input data transformation, called the temporal smoothing method, to generate more transferable adversarial speech examples. In the optimization iterations of the adversarial perturbation, we randomly smooth the input waveforms to prevent the adversarial examples from overfitting the white-box surrogate models. The proposed transformation can be combined with any iterative gradient-based attack method. Extensive experiments demonstrate that our method significantly enhances the transferability of waveform-level adversarial speech examples.
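The abstract describes the core idea (randomly smooth the waveform inside each attack iteration) but not the implementation details. Below is a minimal, hypothetical PyTorch sketch of how such a temporal smoothing transformation could be folded into an iterative gradient-based attack such as I-FGSM. The surrogate `model`, the moving-average filter, and the hyperparameters `eps`, `alpha`, and `max_kernel` are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch: temporal smoothing inside an I-FGSM loop.
# "model" is assumed to be any white-box surrogate anti-spoofing network
# that maps a raw-waveform batch of shape (B, T) to class logits.
import torch
import torch.nn.functional as F

def random_temporal_smooth(wav, max_kernel=9):
    """Smooth a waveform batch (B, T) with a moving-average filter of
    random odd width; a stand-in for the paper's transformation."""
    k = int(torch.randint(1, max_kernel // 2 + 1, (1,))) * 2 + 1  # odd width in {3,5,...}
    kernel = torch.full((1, 1, k), 1.0 / k, device=wav.device)
    # conv1d expects (B, C, T); padding k//2 preserves the length.
    return F.conv1d(wav.unsqueeze(1), kernel, padding=k // 2).squeeze(1)

def smoothed_ifgsm(model, wav, label, eps=0.002, alpha=0.0005, steps=10):
    """I-FGSM on raw waveforms; the gradient is taken through a randomly
    smoothed copy of the current adversarial example at every iteration."""
    adv = wav.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(random_temporal_smooth(adv)), label)
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()           # untargeted ascent step
            adv = wav + (adv - wav).clamp(-eps, eps)  # L_inf projection
            adv = adv.clamp(-1.0, 1.0)                # keep a valid audio range
        adv = adv.detach()
    return adv
```

Because the filter width is redrawn at every iteration, the perturbation cannot latch onto details of one fixed input view of the surrogate, which is the stated mechanism for reducing overfitting; the same smoothing call can be dropped into other iterative attacks (e.g., PGD or MI-FGSM variants) by applying it before each gradient computation.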