Lightweight multi-layered de-identification architecture: Secure client selection in federated learning
Authors: Jiheon Choi, Sangyoon Oh
Journal: Journal of Systems Architecture, vol. 168, Article 103569 (published 2025-09-09)
DOI: 10.1016/j.sysarc.2025.103569
URL: https://www.sciencedirect.com/science/article/pii/S1383762125002413
Impact factor: 4.1 (JCR Q1, Computer Science, Hardware & Architecture)
Citations: 0
Abstract
Although federated learning (FL) avoids sharing raw data, recent studies show that existing client selection schemes introduce privacy vulnerabilities that can lead to organizational identity leaks. To address this problem, we present a lightweight, multi-layered de-identification architecture that enables privacy-preserving client selection while maintaining selection efficiency in FL environments. The architecture comprises a lightweight dynamic hash-based identifier, a secure salt distribution protocol, and an enhanced, resizable Bloom filter verifier. Together, these components provide three critical security properties: formal k-anonymity, ϵ-connection resistance, and α-membership confusion. These properties protect FL training against dictionary attacks, cross-round linkability attacks, and membership-inference attacks. We also offer a theoretical framework that allows fine-grained control of the privacy-efficiency trade-off through adjustable parameters. Experiments on MNIST, Fashion-MNIST, and CIFAR-10 with non-IID datasets and heterogeneous client environments show that our method keeps model accuracy within 0.02 to 15.3 percentage points (pp) of FedAvg and Power-of-Choice while adding only 3.6% to 6.3% computation overhead. These results demonstrate the effectiveness of our approach and show that client selection that balances performance and privacy is achievable.
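The abstract names the building blocks (a salted, round-dependent hash identifier and a Bloom filter verifier) but does not give their construction. The following is only an illustrative sketch of how such pieces could fit together, under stated assumptions: HMAC-SHA256 stands in for the dynamic hash, the per-round salt is assumed to be already distributed, and the Bloom filter is fixed-size rather than resizable. The names `round_pseudonym` and `BloomFilter` and the client labels are hypothetical and do not come from the paper.

```python
import hashlib
import hmac
import math


def round_pseudonym(client_id: str, salt: bytes, round_no: int) -> str:
    """Derive a per-round pseudonym (assumption: HMAC-SHA256 keyed by the round salt)."""
    msg = f"{client_id}|{round_no}".encode()
    return hmac.new(salt, msg, hashlib.sha256).hexdigest()


class BloomFilter:
    """Fixed-size Bloom filter; the paper's verifier is resizable, this sketch is not."""

    def __init__(self, capacity: int, fp_rate: float = 0.01):
        # Textbook sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.m = max(8, math.ceil(-capacity * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: position_i = (h1 + i * h2) mod m.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# Hypothetical round: the server registers pseudonyms, never raw client identities.
salt_r1, salt_r2 = b"salt-round-1", b"salt-round-2"
clients = ["hospital-a", "hospital-b", "hospital-c"]
verifier = BloomFilter(capacity=100, fp_rate=0.01)
for c in clients:
    verifier.add(round_pseudonym(c, salt_r1, 1))

p1 = round_pseudonym("hospital-a", salt_r1, 1)  # pseudonym in round 1
p2 = round_pseudonym("hospital-a", salt_r2, 2)  # same client, new salt and round
```

In this sketch a registered pseudonym always passes the membership check, because Bloom filters have no false negatives, while unregistered values may occasionally pass at roughly the configured false-positive rate. Changing the salt each round yields an unrelated pseudonym for the same client, which is the intuition behind resisting cross-round linkage; the formal guarantees in the abstract depend on the paper's actual protocol, not on this example.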
About the journal:
The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures as well as additional subjects in the computer and system architecture area will fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software.
Design automation of such systems, including methodologies, techniques and tools for their design, as well as novel designs of software components fall within the scope of this journal. Novel applications that use embedded systems are also central in this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components with an emphasis on software are also relevant here.