Zeyi Liao, Lingbo Mo, Chejian Xu, Mintong Kang, Jiawei Zhang, Chaowei Xiao, Yuan Tian, Bo Li, Huan Sun
{"title":"EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage","authors":"Zeyi Liao, Lingbo Mo, Chejian Xu, Mintong Kang, Jiawei Zhang, Chaowei Xiao, Yuan Tian, Bo Li, Huan Sun","doi":"arxiv-2409.11295","DOIUrl":null,"url":null,"abstract":"Generalist web agents have evolved rapidly and demonstrated remarkable\npotential. However, there are unprecedented safety risks associated with these\nthem, which are nearly unexplored so far. In this work, we aim to narrow this\ngap by conducting the first study on the privacy risks of generalist web agents\nin adversarial environments. First, we present a threat model that discusses\nthe adversarial targets, constraints, and attack scenarios. Particularly, we\nconsider two types of adversarial targets: stealing users' specific personally\nidentifiable information (PII) or stealing the entire user request. To achieve\nthese objectives, we propose a novel attack method, termed Environmental\nInjection Attack (EIA). This attack injects malicious content designed to adapt\nwell to different environments where the agents operate, causing them to\nperform unintended actions. This work instantiates EIA specifically for the\nprivacy scenario. It inserts malicious web elements alongside persuasive\ninstructions that mislead web agents into leaking private information, and can\nfurther leverage CSS and JavaScript features to remain stealthy. We collect 177\nactions steps that involve diverse PII categories on realistic websites from\nthe Mind2Web dataset, and conduct extensive experiments using one of the most\ncapable generalist web agent frameworks to date, SeeAct. The results\ndemonstrate that EIA achieves up to 70% ASR in stealing users' specific PII.\nStealing full user requests is more challenging, but a relaxed version of EIA\ncan still achieve 16% ASR. Despite these concerning results, it is important to\nnote that the attack can still be detectable through careful human inspection,\nhighlighting a trade-off between high autonomy and security. This leads to our\ndetailed discussion on the efficacy of EIA under different levels of human\nsupervision as well as implications on defenses for generalist web agents.","PeriodicalId":501332,"journal":{"name":"arXiv - CS - Cryptography and Security","volume":"30 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Cryptography and Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11295","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Generalist web agents have evolved rapidly and demonstrated remarkable
potential. However, there are unprecedented safety risks associated with these
them, which are nearly unexplored so far. In this work, we aim to narrow this
gap by conducting the first study on the privacy risks of generalist web agents
in adversarial environments. First, we present a threat model that discusses
the adversarial targets, constraints, and attack scenarios. Particularly, we
consider two types of adversarial targets: stealing users' specific personally
identifiable information (PII) or stealing the entire user request. To achieve
these objectives, we propose a novel attack method, termed Environmental
Injection Attack (EIA). This attack injects malicious content designed to adapt
well to different environments where the agents operate, causing them to
perform unintended actions. This work instantiates EIA specifically for the
privacy scenario. It inserts malicious web elements alongside persuasive
instructions that mislead web agents into leaking private information, and can
further leverage CSS and JavaScript features to remain stealthy. We collect 177
actions steps that involve diverse PII categories on realistic websites from
the Mind2Web dataset, and conduct extensive experiments using one of the most
capable generalist web agent frameworks to date, SeeAct. The results
demonstrate that EIA achieves up to 70% ASR in stealing users' specific PII.
Stealing full user requests is more challenging, but a relaxed version of EIA
can still achieve 16% ASR. Despite these concerning results, it is important to
note that the attack can still be detectable through careful human inspection,
highlighting a trade-off between high autonomy and security. This leads to our
detailed discussion on the efficacy of EIA under different levels of human
supervision as well as implications on defenses for generalist web agents.