Ankit Kumar, Mikail Mohammed Salim, David Camacho, Jong Hyuk Park
{"title":"多媒体数据安全的大型语言模型:挑战与解决方案综述","authors":"Ankit Kumar , Mikail Mohammed Salim , David Camacho , Jong Hyuk Park","doi":"10.1016/j.comnet.2025.111379","DOIUrl":null,"url":null,"abstract":"<div><div>The rapid expansion of IoT applications utilizes multimedia data integrated with Large Language Models (LLMs) for interpreting digital information by leveraging the capabilities of artificial intelligence (AI) driven neural network systems. These models are extensively used as generative AI tools for data augmentation but data security and privacy remain a fundamental concern associated with LLM model in the digital domain. Traditional security approach shows potential challenges in addressing emerging threats such as adversarial attacks, data poisoning, or privacy breaches, especially in dynamic and resource-constrained IoT environments. Such malicious attacks target the LLM model during the learning and evaluation phase to exploit the vulnerabilities for unauthorized access. The proposed study conducts a comprehensive survey of the transformative potential of LLM models for securing multimedia data offering analysis of their capabilities, challenges, and solutions. The proposed study explores potential security threats and remedies for each type of multimedia data and investigates the various traditional and emerging data protection schemes. The study systematically classifies emerging attacks on LLM models during training and testing phases which include membership attacks, adversarial perturbations, prompt injection, etc. The study also investigates the various robust defense mechanism such as adversarial training, regularization, encryption, etc. The study evaluates the efficiency of potential LLM models such as generative LLM, transformer-based, and other multimodal systems in securing image, text, and video multimedia data highlighting their adaptability and scalability. The proposed survey compares state-of-the-art solutions and underscores the efficiency of LLM-driven mechanisms over traditional approaches in mitigating emerging attacks such as zero-day threats on multimedia data. It ensures real-time compliance with standard regulations like GDPR (General Data Protection Regulation). The proposed work identifies some open challenges including privacy-preserving LLM deployment, black-box interpretability, personalized LLM privacy risk, and cross-model security integration. It also highlights some robust future solutions such as lightweight LLM design and hybrid security frameworks. The proposed work bridges critical research gaps by providing insights into LLM-based emerging techniques to safeguard sensitive data in IoT-based real-world applications.</div></div>","PeriodicalId":50637,"journal":{"name":"Computer Networks","volume":"267 ","pages":"Article 111379"},"PeriodicalIF":4.4000,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A comprehensive survey on large language models for multimedia data security: challenges and solutions\",\"authors\":\"Ankit Kumar , Mikail Mohammed Salim , David Camacho , Jong Hyuk Park\",\"doi\":\"10.1016/j.comnet.2025.111379\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The rapid expansion of IoT applications utilizes multimedia data integrated with Large Language Models (LLMs) for interpreting digital information by leveraging the capabilities of artificial intelligence (AI) driven neural network systems. 
These models are extensively used as generative AI tools for data augmentation but data security and privacy remain a fundamental concern associated with LLM model in the digital domain. Traditional security approach shows potential challenges in addressing emerging threats such as adversarial attacks, data poisoning, or privacy breaches, especially in dynamic and resource-constrained IoT environments. Such malicious attacks target the LLM model during the learning and evaluation phase to exploit the vulnerabilities for unauthorized access. The proposed study conducts a comprehensive survey of the transformative potential of LLM models for securing multimedia data offering analysis of their capabilities, challenges, and solutions. The proposed study explores potential security threats and remedies for each type of multimedia data and investigates the various traditional and emerging data protection schemes. The study systematically classifies emerging attacks on LLM models during training and testing phases which include membership attacks, adversarial perturbations, prompt injection, etc. The study also investigates the various robust defense mechanism such as adversarial training, regularization, encryption, etc. The study evaluates the efficiency of potential LLM models such as generative LLM, transformer-based, and other multimodal systems in securing image, text, and video multimedia data highlighting their adaptability and scalability. The proposed survey compares state-of-the-art solutions and underscores the efficiency of LLM-driven mechanisms over traditional approaches in mitigating emerging attacks such as zero-day threats on multimedia data. It ensures real-time compliance with standard regulations like GDPR (General Data Protection Regulation). The proposed work identifies some open challenges including privacy-preserving LLM deployment, black-box interpretability, personalized LLM privacy risk, and cross-model security integration. It also highlights some robust future solutions such as lightweight LLM design and hybrid security frameworks. The proposed work bridges critical research gaps by providing insights into LLM-based emerging techniques to safeguard sensitive data in IoT-based real-world applications.</div></div>\",\"PeriodicalId\":50637,\"journal\":{\"name\":\"Computer Networks\",\"volume\":\"267 \",\"pages\":\"Article 111379\"},\"PeriodicalIF\":4.4000,\"publicationDate\":\"2025-05-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1389128625003469\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1389128625003469","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
A comprehensive survey on large language models for multimedia data security: challenges and solutions
The rapid expansion of IoT applications integrates multimedia data with Large Language Models (LLMs) to interpret digital information by leveraging artificial intelligence (AI)-driven neural network systems. These models are widely used as generative AI tools for data augmentation, but data security and privacy remain fundamental concerns for LLMs in the digital domain. Traditional security approaches struggle to address emerging threats such as adversarial attacks, data poisoning, and privacy breaches, especially in dynamic and resource-constrained IoT environments. Such malicious attacks target LLMs during the learning and evaluation phases to exploit vulnerabilities for unauthorized access. This study conducts a comprehensive survey of the transformative potential of LLMs for securing multimedia data, analyzing their capabilities, challenges, and solutions. It explores potential security threats and remedies for each type of multimedia data and investigates traditional as well as emerging data protection schemes. The study systematically classifies emerging attacks on LLMs during the training and testing phases, including membership inference attacks, adversarial perturbations, and prompt injection, and examines robust defense mechanisms such as adversarial training, regularization, and encryption. It evaluates the effectiveness of candidate LLM architectures, including generative, transformer-based, and other multimodal systems, in securing image, text, and video data, highlighting their adaptability and scalability. The survey compares state-of-the-art solutions and underscores the advantage of LLM-driven mechanisms over traditional approaches in mitigating emerging attacks such as zero-day threats on multimedia data, while supporting real-time compliance with regulations such as the GDPR (General Data Protection Regulation). The work identifies open challenges, including privacy-preserving LLM deployment, black-box interpretability, personalized LLM privacy risks, and cross-model security integration, and highlights promising future directions such as lightweight LLM design and hybrid security frameworks. Overall, it bridges critical research gaps by providing insights into emerging LLM-based techniques for safeguarding sensitive data in real-world IoT applications.
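Illustrative sketch (not taken from the article): among the defenses the survey lists, adversarial training hardens a model by optimizing it on worst-case perturbed inputs. The minimal PyTorch example below shows one FGSM-style adversarial-training step; the toy classifier, random batch, and epsilon value of 0.03 are placeholder assumptions for illustration only.

import torch
import torch.nn as nn

# Toy classifier standing in for a multimedia (e.g., image) model.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def fgsm_perturb(x, y, epsilon=0.03):
    # Craft an FGSM adversarial example: x + epsilon * sign(grad_x loss).
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

# One adversarial-training step: the model is updated on perturbed inputs
# so it learns to resist small, worst-case changes to the data.
x = torch.rand(32, 1, 28, 28)          # dummy image batch
y = torch.randint(0, 10, (32,))        # dummy labels
x_adv = fgsm_perturb(x, y)

optimizer.zero_grad()
loss = loss_fn(model(x_adv), y)
loss.backward()
optimizer.step()

In practice the same loop would run over real multimedia batches, typically mixing clean and perturbed examples, which is the general pattern the surveyed adversarial-training defenses build on.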
About the journal:
Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.