EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

arXiv - CS - Cryptography and Security | Pub Date: 2024-09-17 | DOI: arxiv-2409.11295

Zeyi Liao, Lingbo Mo, Chejian Xu, Mintong Kang, Jiawei Zhang, Chaowei Xiao, Yuan Tian, Bo Li, Huan Sun



Abstract

Generalist web agents have evolved rapidly and demonstrated remarkable potential. However, these agents carry unprecedented safety risks that remain nearly unexplored. In this work, we aim to narrow this gap by conducting the first study on the privacy risks of generalist web agents in adversarial environments. First, we present a threat model that discusses the adversarial targets, constraints, and attack scenarios. In particular, we consider two types of adversarial targets: stealing users' specific personally identifiable information (PII) or stealing the entire user request. To achieve these objectives, we propose a novel attack method, termed Environmental Injection Attack (EIA). This attack injects malicious content designed to adapt well to the different environments where the agents operate, causing them to perform unintended actions. This work instantiates EIA specifically for the privacy scenario. It inserts malicious web elements alongside persuasive instructions that mislead web agents into leaking private information, and can further leverage CSS and JavaScript features to remain stealthy. We collect 177 action steps that involve diverse PII categories on realistic websites from the Mind2Web dataset, and conduct extensive experiments using one of the most capable generalist web agent frameworks to date, SeeAct. The results demonstrate that EIA achieves an attack success rate (ASR) of up to 70% in stealing users' specific PII. Stealing full user requests is more challenging, but a relaxed version of EIA can still achieve 16% ASR. Despite these concerning results, it is important to note that the attack can still be detected through careful human inspection, highlighting a trade-off between high autonomy and security. This leads to our detailed discussion of the efficacy of EIA under different levels of human supervision, as well as the implications for defenses for generalist web agents.
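To make the defensive side of the abstract concrete, the following is a minimal sketch of a check that flags form fields hidden from a human viewer by inline CSS. It is not taken from the paper. The abstract does not say which CSS features EIA relies on, so the patterns below (zero opacity, `display:none`, `visibility:hidden`) are assumptions, and the names `HiddenFieldFinder` and `find_hidden_fields` are hypothetical. The sketch inspects inline `style` attributes only, so it would miss elements hidden through stylesheets or JavaScript.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns that make an element invisible to a human viewer.
# ASSUMPTION: these patterns are illustrative; the abstract does not list
# the specific CSS features the attack uses.
HIDDEN_STYLE = re.compile(
    r"opacity\s*:\s*0(\.0+)?\s*(;|$)"
    r"|display\s*:\s*none"
    r"|visibility\s*:\s*hidden",
    re.IGNORECASE,
)

# Elements into which an agent could be misled to type private data.
FORM_TAGS = {"input", "textarea", "select"}


class HiddenFieldFinder(HTMLParser):
    """Collects form fields whose inline style hides them from view."""

    def __init__(self):
        super().__init__()
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag not in FORM_TAGS:
            return
        # Attribute values may be None for bare attributes such as "disabled".
        attr_map = {name: (value or "") for name, value in attrs}
        if HIDDEN_STYLE.search(attr_map.get("style", "")):
            self.suspicious.append((tag, attr_map))


def find_hidden_fields(html: str):
    """Return (tag, attributes) pairs for form fields hidden by inline CSS."""
    finder = HiddenFieldFinder()
    finder.feed(html)
    finder.close()
    return finder.suspicious
```

Calling `find_hidden_fields(page_html)` on a page's HTML returns the flagged fields with their attributes, which could then be surfaced to a human supervisor. A check of this kind illustrates only one possible layer of defense under the stated assumptions.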


