On protecting the data privacy of Large Language Models (LLMs) and LLM agents: A literature review

Biwei Yan, Kun Li, Minghui Xu, Yueyan Dong, Yue Zhang, Zhaochun Ren, Xiuzhen Cheng

High-Confidence Computing, vol. 5, no. 2, Article 100300, 28 February 2025. DOI: 10.1016/j.hcc.2025.100300
Large Language Models (LLMs) are complex artificial intelligence systems that can understand, generate, and translate human language. By analyzing large amounts of textual data, these models learn language patterns that let them perform tasks such as writing, conversation, and summarization. Agents built on LLMs (LLM agents) extend these capabilities further, allowing them to process user interactions and perform complex operations in diverse task environments. However, while processing and generating massive amounts of data, LLMs and LLM agents risk leaking sensitive information, potentially threatening data privacy. This paper examines the data privacy issues associated with LLMs and LLM agents to facilitate a comprehensive understanding of them. Specifically, we conduct an in-depth survey of privacy threats, encompassing both passive privacy leakage and active privacy attacks. We then introduce the privacy protection mechanisms employed by LLMs and LLM agents and analyze their effectiveness in detail. Finally, we explore the privacy protection challenges facing LLMs and LLM agents and outline potential directions for future work in this domain.
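The abstract names privacy protection mechanisms only at a high level. As a minimal, hypothetical sketch of one such mechanism (not code from the surveyed paper), the Python snippet below redacts common PII patterns from user text before it would be sent to an LLM or LLM agent; the pattern set, placeholder format, and function names are illustrative assumptions.

import re

# Hypothetical illustration, not taken from the paper: a minimal
# pre-processing filter that redacts common PII patterns from user text
# before it reaches an LLM or LLM agent, one simple instance of the
# "privacy protection mechanism" class of defenses the survey reviews.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US-style ID; assumed format
}

def redact(text: str) -> str:
    """Replace each matched PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

if __name__ == "__main__":
    prompt = "Reach Jane at jane.doe@example.com or +1 555 123 4567."
    print(redact(prompt))  # Reach Jane at [EMAIL] or [PHONE].

Rule-based redaction of this kind addresses only inference-time inputs; the passive leakage threats the survey covers (e.g., memorization of training data) call for complementary mechanisms applied during training as well.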