Feature Extraction Matters More: An Effective and Efficient Universal Deepfake Disruptor

IF 6 3区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Multimedia Computing Communications and Applications Pub Date : 2024-03-20 DOI:10.1145/3653457

Long Tang, Dengpan Ye, Zhenhao Lu, Yunming Zhang, Chuanxi Chen

{"title":"Feature Extraction Matters More: An Effective and Efficient Universal Deepfake Disruptor","authors":"Long Tang, Dengpan Ye, Zhenhao Lu, Yunming Zhang, Chuanxi Chen","doi":"10.1145/3653457","DOIUrl":null,"url":null,"abstract":"Face manipulation can modify a victim’s facial attributes, e.g., age or hair color, in an image, which is an important component of DeepFakes. Adversarial examples are an emerging approach to combat the threat of visual misinformation to society. To efficiently protect facial images from being forged, designing a universal face anti-manipulation disruptor is essential. However, existing works treat deepfake disruption as an end-to-end process, ignoring the functional difference between feature extraction and image reconstruction. In this work, we propose a novel Feature-Output ensemble UNiversal Disruptor (FOUND) against face manipulation networks, which explores a new opinion considering attacking feature-extraction (encoding) modules as the critical task in deepfake disruption. We conduct an effective two-stage disruption process. We first perform ensemble disruption on multi-model encoders, maximizing the Wasserstein distance between features before and after the adversarial attack. Then develop a gradient-ensemble strategy to enhance the disruption effect by simplifying the complex optimization problem of disrupting ensemble end-to-end models. Extensive experiments indicate that one FOUND generated with a few facial images can successfully disrupt multiple face manipulation models on cross-attribute and cross-face images, surpassing state-of-the-art universal disruptors in both success rate and efficiency.","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":"22 1","pages":""},"PeriodicalIF":6.0000,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3653457","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Face manipulation can modify a victim’s facial attributes, e.g., age or hair color, in an image, which is an important component of DeepFakes. Adversarial examples are an emerging approach to combat the threat of visual misinformation to society. To efficiently protect facial images from being forged, designing a universal face anti-manipulation disruptor is essential. However, existing works treat deepfake disruption as an end-to-end process, ignoring the functional difference between feature extraction and image reconstruction. In this work, we propose a novel Feature-Output ensemble UNiversal Disruptor (FOUND) against face manipulation networks, which explores a new opinion considering attacking feature-extraction (encoding) modules as the critical task in deepfake disruption. We conduct an effective two-stage disruption process. We first perform ensemble disruption on multi-model encoders, maximizing the Wasserstein distance between features before and after the adversarial attack. Then develop a gradient-ensemble strategy to enhance the disruption effect by simplifying the complex optimization problem of disrupting ensemble end-to-end models. Extensive experiments indicate that one FOUND generated with a few facial images can successfully disrupt multiple face manipulation models on cross-attribute and cross-face images, surpassing state-of-the-art universal disruptors in both success rate and efficiency.

查看原文本刊更多论文

特征提取更重要高效通用的深度伪造干扰器

人脸操作可以修改图像中受害者的面部属性，如年龄或发色，这是 DeepFakes 的重要组成部分。对抗性实例是一种新兴的方法，可用于消除视觉错误信息对社会的威胁。为了有效保护面部图像不被伪造，设计一种通用的面部防操纵干扰器至关重要。然而，现有的工作将深度防伪作为一个端到端的过程，忽略了特征提取和图像重建之间的功能差异。在这项工作中，我们提出了一种新型的特征-输出集合通用干扰器（FOUND）来对付人脸操纵网络，它探索了一种新的观点，将攻击特征提取（编码）模块作为深度防伪干扰的关键任务。我们进行了有效的两阶段破坏过程。我们首先对多模型编码器进行集合干扰，最大化对抗性攻击前后特征之间的瓦瑟斯坦距离。然后开发一种梯度-集合策略，通过简化破坏集合端到端模型的复杂优化问题来增强破坏效果。大量实验表明，用少量面部图像生成的一个 FOUND 可以成功地破坏跨属性和跨面部图像上的多个人脸操作模型，在成功率和效率上都超过了最先进的通用破坏器。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

ACM Transactions on Multimedia Computing Communications and Applications 工程技术-计算机：理论方法

CiteScore

8.50

自引率

5.90%

发文量

285

审稿时长

7.5 months

期刊介绍： The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print form and digital form. The Journal is published quarterly; with roughly 7 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure a timely publication. The transactions consists primarily of research papers. This is an archival journal and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.