Multi-criteria linguistic optimization for covert communication in secure LLM-based steganography
Authors: Kamil Woźniak, Marek R. Ogiela, Lidia Ogiela
Journal: Applied Soft Computing, volume 185, Article 113960 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 6.6)
Publication date: 2025-09-20
DOI: 10.1016/j.asoc.2025.113960
URL: https://www.sciencedirect.com/science/article/pii/S1568494625012736
This paper presents a novel framework for covert communication through secure steganography using large language models (LLMs). Our approach leverages multi-criteria linguistic optimization to encode secret information directly into stylistic features of auto-regressively generated text. This strategy balances embedding capacity with naturalness and coherence. The secret message is partitioned into fixed-size blocks. Each block is embedded into binary stylistic feature vectors via a surjective linear mapping, which introduces redundancy. This redundancy enables the use of a history-aware cost function that selects stylistic vectors to minimize abrupt transitions and preserve fluency across sentences. Candidate sentences are generated by prompting LLMs with contextual and stylistic constraints. Rejection sampling then ensures exact feature matching and high linguistic quality. Experimental evaluation across multiple LLMs, diverse text contexts, and parameter settings demonstrates effective embedding capacities of up to 0.30 bits per token while maintaining strong linguistic naturalness, validated through perplexity, lexical diversity, readability, and a linguistic acceptability metric. Importantly, decoding recovers the full secret with zero error under ideal conditions, which confirms the reliability of the method in that setting. The current work focuses on embedding efficiency and imperceptibility. Robustness against active text alterations and formal undetectability assessments remain open challenges for future research. The proposed multi-criteria linguistic optimization framework offers a promising avenue for advanced covert communication by harmonizing secure information embedding with fluent, human-like language generation.
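The block-embedding step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the matrix `A`, the 4-bit feature length, the 2-bit block size, and the Hamming-distance cost are all assumptions chosen for illustration, and the LLM prompting and rejection-sampling stages are omitted. The sketch only shows how a surjective linear map over GF(2) gives each message block several candidate feature vectors, how a history-aware cost can choose among them, and why decoding is exact when the feature vectors are recovered without error.

```python
from itertools import product

# Hypothetical 2x4 binary matrix (an assumption, not taken from the paper).
# It maps a 4-bit stylistic feature vector to a 2-bit message block, mod 2.
# Its rows are linearly independent, so the map is surjective and every
# block has 2**(4 - 2) = 4 preimages, which is the redundancy being used.
A = [
    [1, 0, 1, 0],
    [0, 1, 1, 1],
]

def apply_map(A, v):
    """Compute A @ v over GF(2)."""
    return tuple(sum(a * x for a, x in zip(row, v)) % 2 for row in A)

def preimages(A, block):
    """All feature vectors that the map sends to the given block."""
    n = len(A[0])
    return [v for v in product((0, 1), repeat=n) if apply_map(A, v) == tuple(block)]

def hamming(u, v):
    """Number of positions where two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def encode(bits, A, start=None):
    """Pick one feature vector per block, staying close to the previous one.

    The Hamming distance to the previous vector stands in for the paper's
    history-aware cost function; the real cost is not specified here.
    """
    k, n = len(A), len(A[0])
    if len(bits) % k != 0:
        raise ValueError("message length must be a multiple of the block size")
    prev = tuple(start) if start is not None else (0,) * n
    chosen = []
    for i in range(0, len(bits), k):
        block = tuple(bits[i:i + k])
        best = min(preimages(A, block), key=lambda v: hamming(v, prev))
        chosen.append(best)
        prev = best
    return chosen

def decode(vectors, A):
    """Recover the message by applying the same linear map to each vector."""
    return [b for v in vectors for b in apply_map(A, v)]

if __name__ == "__main__":
    secret = [1, 0, 0, 1, 1, 1]
    vectors = encode(secret, A)
    print(vectors)
    print(decode(vectors, A) == secret)
```

In a full pipeline, each chosen vector would become a set of stylistic constraints in the LLM prompt, and generated sentences would be rejected until their extracted features match the vector exactly. The receiver would then extract the same features from each sentence and apply `decode`.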
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.