{"title":"通过合成数据和零信任集成加强联合特征选择","authors":"Nisha Thorakkattu Madathil;Saed Alrabaee;Abdelkader Nasreddine Belkacem","doi":"10.1109/JSAC.2025.3560037","DOIUrl":null,"url":null,"abstract":"Federated Learning (FL) allows healthcare organizations to train models using diverse datasets while maintaining patient confidentiality collaboratively. While promising, FL faces challenges in optimizing model accuracy and communication efficiency. To address these, we propose an algorithm that combines feature selection with synthetic data generation, specifically targeting medical datasets. Our method eliminates irrelevant local features, identifies globally relevant ones, and uses synthetic data to initialize model parameters, improving convergence. It also employs a zero-trust model, ensuring that data remain on local devices and only learned weights are shared with the central server, enhancing security. The algorithm improves accuracy and computational efficiency, achieving communication efficiency gains of 4 to 14 through backward elimination and threshold variation techniques. Tested on a federated diabetic dataset, the approach demonstrates significant improvements in the performance and trustworthiness of FL systems for medical applications.","PeriodicalId":73294,"journal":{"name":"IEEE journal on selected areas in communications : a publication of the IEEE Communications Society","volume":"43 6","pages":"2126-2140"},"PeriodicalIF":17.2000,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing Federated Feature Selection Through Synthetic Data and Zero Trust Integration\",\"authors\":\"Nisha Thorakkattu Madathil;Saed Alrabaee;Abdelkader Nasreddine Belkacem\",\"doi\":\"10.1109/JSAC.2025.3560037\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated Learning (FL) allows healthcare organizations to train models using diverse datasets while maintaining patient confidentiality collaboratively. While promising, FL faces challenges in optimizing model accuracy and communication efficiency. To address these, we propose an algorithm that combines feature selection with synthetic data generation, specifically targeting medical datasets. Our method eliminates irrelevant local features, identifies globally relevant ones, and uses synthetic data to initialize model parameters, improving convergence. It also employs a zero-trust model, ensuring that data remain on local devices and only learned weights are shared with the central server, enhancing security. The algorithm improves accuracy and computational efficiency, achieving communication efficiency gains of 4 to 14 through backward elimination and threshold variation techniques. Tested on a federated diabetic dataset, the approach demonstrates significant improvements in the performance and trustworthiness of FL systems for medical applications.\",\"PeriodicalId\":73294,\"journal\":{\"name\":\"IEEE journal on selected areas in communications : a publication of the IEEE Communications Society\",\"volume\":\"43 6\",\"pages\":\"2126-2140\"},\"PeriodicalIF\":17.2000,\"publicationDate\":\"2025-04-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE journal on selected areas in communications : a publication of the IEEE Communications Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10963976/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE journal on selected areas in communications : a publication of the IEEE Communications Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10963976/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Enhancing Federated Feature Selection Through Synthetic Data and Zero Trust Integration
Federated Learning (FL) allows healthcare organizations to train models using diverse datasets while maintaining patient confidentiality collaboratively. While promising, FL faces challenges in optimizing model accuracy and communication efficiency. To address these, we propose an algorithm that combines feature selection with synthetic data generation, specifically targeting medical datasets. Our method eliminates irrelevant local features, identifies globally relevant ones, and uses synthetic data to initialize model parameters, improving convergence. It also employs a zero-trust model, ensuring that data remain on local devices and only learned weights are shared with the central server, enhancing security. The algorithm improves accuracy and computational efficiency, achieving communication efficiency gains of 4 to 14 through backward elimination and threshold variation techniques. Tested on a federated diabetic dataset, the approach demonstrates significant improvements in the performance and trustworthiness of FL systems for medical applications.