UltraAdv: An Ultrasonic Adversarial Attack on Closed-Box Speech Recognition Systems

Impact Factor: 9.2 | Zone 2, Computer Science | JCR Q1: Computer Science, Information Systems

IEEE Transactions on Mobile Computing | Pub Date: 2025-03-31 | DOI: 10.1109/TMC.2025.3555680

Guoming Zhang; Xiaohui Ma; Huiting Zhang; Riccardo Spolaor; Yanni Yang; Xiaoyu Ji; Xiuzhen Cheng; Pengfei Hu

Vol. 24, No. 8, pp. 7648-7662. Full record: https://ieeexplore.ieee.org/document/10946237/

Citations: 0

Abstract


UltraAdv: An Ultrasonic Adversarial Attack on Closed-Box Speech Recognition Systems

Attacks on speech recognition systems often use adversarial or inaudible commands. One challenge is that adversarial perturbations typically fall within the audible frequency range, which makes inaudibility difficult to achieve. In addition, the non-linear effects of loudspeakers often cause inaudible commands to become audible at higher power levels, so minimizing the power the attack requires is essential to keeping it inaudible. Another significant obstacle is converting variable-length commands, especially longer ones, into shorter target commands. In this paper, we present UltraAdv, a method for generating long-range adversarial perturbations capable of compromising commands of arbitrary length in a closed-box setting. By combining the ultrasonic signal with the normal one, rather than negating it as in DolphinAttack, we significantly improve energy efficiency and thereby extend the attack distance. We also propose a dynamically adjustable suppression-interference method based on automatic gain control to handle the mismatch in duration between long commands and target commands, making the attack length-independent. Experiments demonstrate that, using a single perturbation, we achieve success rates of 98.84%, 96.62%, and 98.32% on DeepSpeech, iFlytek, and Whisper, respectively, across a diverse set of 12,260 speech samples. The attack range reaches up to 15 m, surpassing DolphinAttack's 5 m range at equivalent power.
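The abstract mentions automatic gain control (AGC) only in passing. As a rough illustration of the general idea, the sketch below shows a textbook feed-forward AGC lowering its gain when a strong signal is present, which in turn attenuates a quieter signal mixed with it. This is not the authors' suppression-interference method; the function names, tone frequencies, and every parameter value are hypothetical choices made for the example.

```python
import math


def agc(samples, target_level=0.1, attack=0.2, release=0.001, max_gain=10.0):
    """Generic feed-forward AGC (illustrative only, not the paper's method).

    Returns the gain-adjusted samples and the per-sample gain that was applied.
    """
    envelope = 1e-6
    out, gains = [], []
    for x in samples:
        level = abs(x)
        # Follow rising levels quickly (attack) and falling levels slowly (release).
        coeff = attack if level > envelope else release
        envelope += coeff * (level - envelope)
        gain = min(max_gain, target_level / max(envelope, 1e-6))
        out.append(gain * x)
        gains.append(gain)
    return out, gains


def tone(freq_hz, amplitude, n, sample_rate=16000):
    """Generate n samples of a sine tone."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def mean(values):
    return sum(values) / len(values)


n = 16000
quiet = tone(300, 0.05, n)   # hypothetical stand-in for a spoken command
loud = tone(1000, 1.0, n)    # hypothetical stand-in for a strong interfering signal
mixed = [q + l for q, l in zip(quiet, loud)]

_, gains_alone = agc(quiet)
_, gains_mixed = agc(mixed)

# Compare steady-state gain (second half of the signal) in both cases.
gain_alone = mean(gains_alone[n // 2:])
gain_mixed = mean(gains_mixed[n // 2:])
print(f"gain on quiet signal alone: {gain_alone:.2f}")
print(f"gain when loud signal is present: {gain_mixed:.2f}")
```

With the loud tone present, the envelope follower settles near the loud tone's amplitude, so the gain applied to the whole mixture, including the quiet component, drops well below the gain the quiet signal would receive on its own.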


Source journal

IEEE Transactions on Mobile Computing (Engineering & Technology: Telecommunications)

CiteScore: 12.90
Self-citation rate: 2.50%
Articles published: 403
Review time: 6.6 months

Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.