ASSMark: Dual Defense Against Speech Synthesis Attack via Adversarial Robust Watermarking

IF 3.2 | Region 2, Engineering Technology | JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC
Yulin He;Hongxia Wang;Yiqin Qiu;Hao Cao
Journal: IEEE Signal Processing Letters, vol. 32, pp. 1870-1874
Published: 2025-04-21
Type: Journal Article
DOI: 10.1109/LSP.2025.3562817
URL: https://ieeexplore.ieee.org/document/10971213/
Citations: 0

Abstract

Given the widespread dissemination of digital audio and the advances in speech synthesis technologies, protecting audio copyright has become a critical issue. Although watermarks play an important role in copyright verification and forensic analysis, they cannot proactively defend against malicious speech synthesis. To address this issue, we introduce a novel adversarial speech synthesis watermarking mechanism (ASSMark), which simultaneously traces audio copyright and disrupts speech synthesis models by embedding a robust adversarial watermark in a single step. Specifically, we design a unified training framework that models watermark embedding and adversarial perturbation as collaborative tasks. This approach allows any robust watermark to be fine-tuned into an adversarial watermark, yielding watermarked audio that effectively defends against unauthorized speech synthesis attacks. Experimental results demonstrate that ASSMark achieves a protection rate of over 90% even against unknown black-box models. Compared with simple two-step protection methods, it not only resists synthesis attacks effectively but also achieves higher watermark extraction accuracy and better speech quality, offering a strong solution for protecting audio copyright.
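The abstract does not state the training objective, so the following is only a minimal sketch of what treating watermark embedding and adversarial perturbation as collaborative tasks could look like, assuming the tasks are combined as a weighted sum of three loss terms. Every name here (`joint_loss`, `synth_similarity`, the weights) is hypothetical and not taken from the paper; `synth_similarity` stands in for whatever score a synthesis model would produce when it tries to clone the watermarked audio.

```python
import math
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length waveforms."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bce_with_logits(logits: Sequence[float], bits: Sequence[int]) -> float:
    """Numerically stable binary cross-entropy, averaged over message bits."""
    total = 0.0
    for z, y in zip(logits, bits):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def joint_loss(
    host: Sequence[float],      # original audio samples
    marked: Sequence[float],    # audio after watermark embedding
    bit_logits: Sequence[float],  # extractor outputs, one logit per bit
    bits: Sequence[int],        # embedded message bits (0 or 1)
    synth_similarity: float,    # assumed score: how well synthesis still works
    w_quality: float = 1.0,     # hypothetical weights, not from the paper
    w_message: float = 1.0,
    w_adv: float = 1.0,
) -> float:
    """Weighted sum: keep audio quality, keep the message recoverable,
    and push the synthesis model's success score down."""
    quality = mse(host, marked)
    message = bce_with_logits(bit_logits, bits)
    return w_quality * quality + w_message * message + w_adv * synth_similarity
```

Under this assumed formulation, a single optimizer step updates the embedder against all three terms at once, which is one way to read the contrast the abstract draws with two-step protection methods that watermark first and perturb afterwards.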
Source journal
IEEE Signal Processing Letters (Engineering Technology: Engineering, Electrical & Electronic)
CiteScore: 7.40
Self-citation rate: 12.80%
Articles published: 339
Review time: 2.8 months
Journal description: The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.