On protecting the data privacy of Large Language Models (LLMs) and LLM agents: A literature review

Biwei Yan, Kun Li, Minghui Xu, Yueyan Dong, Yue Zhang, Zhaochun Ren, Xiuzhen Cheng

High-Confidence Computing, vol. 5, no. 2, Article 100300, 28 February 2025. DOI: 10.1016/j.hcc.2025.100300
Large Language Models (LLMs) are complex artificial intelligence systems that can understand, generate, and translate human language. By analyzing large amounts of textual data, these models learn language patterns that let them perform tasks such as writing, conversation, and summarization. Agents built on LLMs (LLM agents) extend these capabilities further, allowing them to process user interactions and perform complex operations in diverse task environments. However, while processing and generating massive amounts of data, LLMs and LLM agents risk leaking sensitive information, potentially threatening data privacy. This paper examines the data privacy issues associated with LLMs and LLM agents to facilitate a comprehensive understanding of them. Specifically, we conduct an in-depth survey of privacy threats, encompassing both passive privacy leakage and active privacy attacks. We then introduce the privacy protection mechanisms employed by LLMs and LLM agents and analyze their effectiveness in detail. Finally, we explore the privacy protection challenges facing LLMs and LLM agents and outline potential directions for future work in this domain.
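The abstract names privacy protection mechanisms only at a high level. As a minimal, hypothetical sketch of one such mechanism (not code from the surveyed paper), the Python snippet below redacts common PII patterns from user text before it would be sent to an LLM or LLM agent; the pattern set, placeholder format, and function names are illustrative assumptions.

import re

# Hypothetical illustration, not taken from the paper: a minimal
# pre-processing filter that redacts common PII patterns from user text
# before it reaches an LLM or LLM agent, one simple instance of the
# "privacy protection mechanism" class of defenses the survey reviews.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US-style ID; assumed format
}

def redact(text: str) -> str:
    """Replace each matched PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

if __name__ == "__main__":
    prompt = "Reach Jane at jane.doe@example.com or +1 555 123 4567."
    print(redact(prompt))  # Reach Jane at [EMAIL] or [PHONE].

Rule-based redaction of this kind addresses only inference-time inputs; the passive leakage threats the survey covers (e.g., memorization of training data) call for complementary mechanisms applied during training as well.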