Third-party private set intersection with application to privacy-preserving training of large language models
Zhenhua Liu, Han Liang, Jinhua Wang, Baocang Wang
Journal of Information Security and Applications, vol. 91, Article 104061. Published 2025-04-24. DOI: 10.1016/j.jisa.2025.104061
Impact factor: 3.8 (JCR Q2, Computer Science, Information Systems)
https://www.sciencedirect.com/science/article/pii/S2214212625000985
Citations: 0
Abstract
In the training of large language models (LLMs), protecting private datasets is especially crucial. The private set intersection (PSI) mechanism acts as a potent privacy-preserving collaborative learning technique: it allows participants to collaborate in model training without revealing their own data, thereby meeting the training requirements of LLMs. In this paper, we consider a variant of PSI, namely third-party PSI, in which a third party with no input privately receives the intersection of the other two parties' sets, while the two parties output nothing. We propose a general construction of a third-party PSI protocol from leveled fully homomorphic encryption, which ensures privacy-preserving training of large language models. The proposed construction supports the intersection of arbitrary-length items by using polynomial links, and its security can be proven in the presence of semi-honest adversaries. Compared with existing protocols, the instantiation of the proposed general construction achieves higher computational efficiency while maintaining an equivalent level of communication complexity. More importantly, the proposed protocol offers better utility, effectively safeguarding the privacy of the data without compromising model accuracy.
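The input/output behaviour that the abstract assigns to third-party PSI can be illustrated with a minimal Python sketch. This is only the ideal functionality, computed in the clear by a hypothetical trusted box; it is not the paper's protocol, which is built from leveled fully homomorphic encryption. The function name `third_party_psi_functionality` and the sample e-mail addresses are assumptions made for illustration.

```python
from typing import FrozenSet, Hashable, Iterable, Tuple


def third_party_psi_functionality(
    set_a: Iterable[Hashable],
    set_b: Iterable[Hashable],
) -> Tuple[None, None, FrozenSet[Hashable]]:
    """Ideal functionality only (hypothetical helper, not the paper's protocol).

    Returns (output of party A, output of party B, output of the third party).
    """
    # A trusted box sees both sets in the clear and intersects them.
    intersection = frozenset(set_a) & frozenset(set_b)
    # The two input parties output nothing; the third party, which has
    # no input of its own, is the only one to receive the intersection.
    return None, None, intersection


# Illustrative inputs (made-up data).
out_a, out_b, out_third = third_party_psi_functionality(
    {"alice@example.com", "bob@example.com", "carol@example.com"},
    {"bob@example.com", "carol@example.com", "dave@example.com"},
)
```

A real protocol has to realise this same output pattern without any trusted box, so that neither input party learns the other's set and the third party learns nothing beyond the intersection.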
About the journal:
Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications relevant to information security. JISA provides a common link between a vibrant scientific and research community and industry professionals by offering a clear view of modern problems and challenges in information security, as well as identifying promising scientific and "best-practice" solutions. JISA issues offer a balance between original research work and innovative industrial approaches by internationally renowned information security experts and researchers.