Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking

Stav Cohen, Ron Bitton, Ben Nassi
{"title":"Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking","authors":"Stav Cohen, Ron Bitton, Ben Nassi","doi":"arxiv-2409.08045","DOIUrl":null,"url":null,"abstract":"In this paper, we show that with the ability to jailbreak a GenAI model,\nattackers can escalate the outcome of attacks against RAG-based GenAI-powered\napplications in severity and scale. In the first part of the paper, we show\nthat attackers can escalate RAG membership inference attacks and RAG entity\nextraction attacks to RAG documents extraction attacks, forcing a more severe\noutcome compared to existing attacks. We evaluate the results obtained from\nthree extraction methods, the influence of the type and the size of five\nembeddings algorithms employed, the size of the provided context, and the GenAI\nengine. We show that attackers can extract 80%-99.8% of the data stored in the\ndatabase used by the RAG of a Q&A chatbot. In the second part of the paper, we\nshow that attackers can escalate the scale of RAG data poisoning attacks from\ncompromising a single GenAI-powered application to compromising the entire\nGenAI ecosystem, forcing a greater scale of damage. This is done by crafting an\nadversarial self-replicating prompt that triggers a chain reaction of a\ncomputer worm within the ecosystem and forces each affected application to\nperform a malicious activity and compromise the RAG of additional applications.\nWe evaluate the performance of the worm in creating a chain of confidential\ndata extraction about users within a GenAI ecosystem of GenAI-powered email\nassistants and analyze how the performance of the worm is affected by the size\nof the context, the adversarial self-replicating prompt used, the type and size\nof the embeddings algorithm employed, and the number of hops in the\npropagation. Finally, we review and analyze guardrails to protect RAG-based\ninference and discuss the tradeoffs.","PeriodicalId":501332,"journal":{"name":"arXiv - CS - Cryptography and Security","volume":"5 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Cryptography and Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.08045","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In this paper, we show that with the ability to jailbreak a GenAI model, attackers can escalate the outcome of attacks against RAG-based GenAI-powered applications in severity and scale. In the first part of the paper, we show that attackers can escalate RAG membership inference attacks and RAG entity extraction attacks to RAG documents extraction attacks, forcing a more severe outcome compared to existing attacks. We evaluate the results obtained from three extraction methods, the influence of the type and the size of five embeddings algorithms employed, the size of the provided context, and the GenAI engine. We show that attackers can extract 80%-99.8% of the data stored in the database used by the RAG of a Q&A chatbot. In the second part of the paper, we show that attackers can escalate the scale of RAG data poisoning attacks from compromising a single GenAI-powered application to compromising the entire GenAI ecosystem, forcing a greater scale of damage. This is done by crafting an adversarial self-replicating prompt that triggers a chain reaction of a computer worm within the ecosystem and forces each affected application to perform a malicious activity and compromise the RAG of additional applications. We evaluate the performance of the worm in creating a chain of confidential data extraction about users within a GenAI ecosystem of GenAI-powered email assistants and analyze how the performance of the worm is affected by the size of the context, the adversarial self-replicating prompt used, the type and size of the embeddings algorithm employed, and the number of hops in the propagation. Finally, we review and analyze guardrails to protect RAG-based inference and discuss the tradeoffs.
释放蠕虫和提取数据:利用 "越狱 "技术扩大对基于 RAG 推断的攻击结果的规模和严重程度
在本文中,我们展示了利用 GenAI 模型越狱的能力,攻击者可以在严重程度和规模上升级对基于 RAG 的 GenAI-powered 应用程序的攻击结果。在本文的第一部分,我们展示了攻击者可以将RAG成员推理攻击和RAG实体提取攻击升级为RAG文档提取攻击,从而迫使攻击结果比现有攻击更加严重。我们评估了三种提取方法的结果、所采用的五种嵌入算法的类型和大小、所提供上下文的大小以及 GenAIengine 的影响。我们的研究表明,攻击者可以提取存储在问答聊天机器人 RAG 使用的数据库中的 80%-99.8% 的数据。在论文的第二部分,我们展示了攻击者可以将 RAG 数据中毒攻击的规模从破坏单个 GenAI 驱动的应用程序升级到破坏整个 GenAI 生态系统,从而造成更大范围的破坏。具体做法是制作一个对抗性的自我复制提示,在生态系统中触发计算机蠕虫的连锁反应,迫使每个受影响的应用程序执行恶意活动,并破坏其他应用程序的 RAG。我们评估了该蠕虫在由 GenAI 驱动的电子邮件助手组成的 GenAI 生态系统中创建用户机密数据提取链时的性能,并分析了蠕虫的性能如何受到上下文大小、所使用的对抗性自我复制提示、所使用的嵌入算法类型和大小以及传播跳数的影响。最后,我们回顾并分析了保护基于 RAG 的推理的防护措施,并讨论了其中的权衡问题。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信