Title: PraVFed: Practical Heterogeneous Vertical Federated Learning via Representation Learning
Authors: Shuo Wang; Keke Gai; Jing Yu; Zijian Zhang; Liehuang Zhu
DOI: 10.1109/TIFS.2025.3530700
Journal: IEEE Transactions on Information Forensics and Security, vol. 20, pp. 2693-2705
Publication date: 2025-01-16 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10843771/
PraVFed: Practical Heterogeneous Vertical Federated Learning via Representation Learning
Vertical federated learning (VFL) provides a privacy-preserving approach to machine learning, enabling collaborative training across multiple institutions whose data are vertically partitioned. Existing VFL methods assume that all passive parties hold local models of the same structure and communicate with the active party during every training batch. However, because participating institutions are heterogeneous, VFL that supports heterogeneous models with efficient communication is indispensable in real-life scenarios. To address this challenge, we propose a new VFL method, Practical Heterogeneous Vertical Federated Learning via Representation Learning (PraVFed), which supports training parties with heterogeneous local models while reducing communication costs. Specifically, PraVFed applies weighted aggregation to the passive parties' local embedding values to mitigate the influence of heterogeneous local model information on the global model. Furthermore, to safeguard each passive party's local sample features, we use blinding factors to protect its local embedding values. To reduce communication costs, the passive party performs multiple rounds of local pre-training while label privacy is preserved. Comprehensive theoretical analysis and extensive experiments demonstrate that PraVFed reduces communication overhead under heterogeneous models and outperforms other approaches; for example, at a target accuracy of 60% on the CINIC10 dataset, PraVFed reduces communication cost by 70.57% compared with the baseline method. Our code is available at https://github.com/wangshuo105/PraVFed_main.
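To make the abstract's two core ideas concrete, the sketch below illustrates how weighted aggregation of passive parties' embeddings can be combined with blinding factors so that the active party sees only blinded values. This is a minimal toy example, not the paper's actual protocol: the weights, the common embedding dimension, and the pairwise-cancelling masks (a standard secure-aggregation construction) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: three passive parties each produce an embedding of the
# same sample from heterogeneous local models, projected to a common size d.
d = 8
embeddings = [rng.normal(size=d) for _ in range(3)]

# Illustrative aggregation weights (e.g., reflecting local model quality);
# the paper's actual weighting rule may differ.
weights = np.array([0.5, 0.3, 0.2])

def pairwise_masks(n, d, rng):
    """Blinding factors that cancel in the sum: parties i and j share a
    random mask, which i adds and j subtracts, so the masks sum to zero."""
    masks = [np.zeros(d) for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.normal(size=d)
            masks[i] += m
            masks[j] -= m
    return masks

masks = pairwise_masks(3, d, rng)

# Each passive party sends its weighted embedding plus its blinding factor.
blinded = [w * e + m for w, e, m in zip(weights, embeddings, masks)]

# The active party sees only blinded values, yet their sum equals the true
# weighted aggregate because the masks cancel.
aggregate = np.sum(blinded, axis=0)
true_aggregate = np.sum([w * e for w, e in zip(weights, embeddings)], axis=0)
assert np.allclose(aggregate, true_aggregate)
```

The design point this illustrates: no individual blinded embedding reveals a party's true embedding, but the aggregate needed for the global model is recovered exactly.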
Journal introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.