Privacy Protection Optimization for Federated Software Defect Prediction via Benchmark Analysis

Ying Liu, Yong Li, Ming Wen, Wenjing Zhang
{"title":"通过基准分析优化联合软件缺陷预测的隐私保护","authors":"Ying Liu Ying Liu, Yong Li Ying Liu, Ming Wen Yong Li, Wenjing Zhang Ming Wen","doi":"10.53106/160792642023112406001","DOIUrl":null,"url":null,"abstract":"Federated learning is a privacy-preserving machine learning technique that coordinates multi-participant co-modeling. It can alleviate the privacy issues of software defect prediction, which is an important technical way to ensure software quality. In this work, we implement Federated Software Defect Prediction (FedSDP) and optimize its privacy issues while guaranteeing performance. We first construct a new benchmark to study the performance and privacy of Federated Software defect prediction. The benchmark consists of (1) 12 NASA software defect datasets, which are all real software defect datasets from different projects in different domains, (2) Horizontal federated learning scenarios, and (3) the Federated Software Defect Prediction algorithm (FedSDP). Benchmark analysis shows that FedSDP provides additional privacy protection and security with guaranteed model performance compared to local training. It also reveals that FedSDP introduces a large amount of model parameter computation and exchange during the training process. There are model user threats and attack challenges from unreliable participants. To provide more reliable privacy protection without losing prediction performance we proposed optimization methods that use homomorphic encryption model parameters to resist honest but curious participants. Experimental results show that our approach achieves more reliable privacy protection with excellent performance on all datasets.","PeriodicalId":442331,"journal":{"name":"網際網路技術學刊","volume":"20 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Privacy Protection Optimization for Federated Software Defect Prediction via Benchmark Analysis\",\"authors\":\"Ying Liu Ying Liu, Yong Li Ying Liu, Ming Wen Yong Li, Wenjing Zhang Ming Wen\",\"doi\":\"10.53106/160792642023112406001\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated learning is a privacy-preserving machine learning technique that coordinates multi-participant co-modeling. It can alleviate the privacy issues of software defect prediction, which is an important technical way to ensure software quality. In this work, we implement Federated Software Defect Prediction (FedSDP) and optimize its privacy issues while guaranteeing performance. We first construct a new benchmark to study the performance and privacy of Federated Software defect prediction. The benchmark consists of (1) 12 NASA software defect datasets, which are all real software defect datasets from different projects in different domains, (2) Horizontal federated learning scenarios, and (3) the Federated Software Defect Prediction algorithm (FedSDP). Benchmark analysis shows that FedSDP provides additional privacy protection and security with guaranteed model performance compared to local training. It also reveals that FedSDP introduces a large amount of model parameter computation and exchange during the training process. There are model user threats and attack challenges from unreliable participants. To provide more reliable privacy protection without losing prediction performance we proposed optimization methods that use homomorphic encryption model parameters to resist honest but curious participants. 
Experimental results show that our approach achieves more reliable privacy protection with excellent performance on all datasets.\",\"PeriodicalId\":442331,\"journal\":{\"name\":\"網際網路技術學刊\",\"volume\":\"20 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"網際網路技術學刊\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.53106/160792642023112406001\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"網際網路技術學刊","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.53106/160792642023112406001","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Federated learning is a privacy-preserving machine learning technique that coordinates model training across multiple participants. It can alleviate the privacy issues of software defect prediction, an important technique for ensuring software quality. In this work, we implement Federated Software Defect Prediction (FedSDP) and optimize its privacy protection while guaranteeing performance. We first construct a new benchmark to study the performance and privacy of federated software defect prediction. The benchmark consists of (1) 12 NASA software defect datasets, all real defect datasets drawn from different projects in different domains, (2) horizontal federated learning scenarios, and (3) the FedSDP algorithm. Benchmark analysis shows that, compared to local training, FedSDP provides additional privacy protection and security while preserving model performance. It also reveals that FedSDP introduces a large amount of model parameter computation and exchange during training, exposing the model user to threats and attacks from unreliable participants. To provide more reliable privacy protection without losing prediction performance, we propose an optimization method that homomorphically encrypts model parameters to resist honest-but-curious participants. Experimental results show that our approach achieves more reliable privacy protection with excellent performance on all datasets.
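The abstract names the technique (homomorphic encryption of exchanged model parameters) but gives no implementation details. Below is a minimal sketch of the general idea, not the paper's code: it assumes a Paillier cryptosystem via the open-source `phe` Python library and a FedAvg-style aggregation, and all function and variable names (`encrypt_params`, `aggregate`, `local_updates`) are illustrative. Because Paillier is additively homomorphic, an honest-but-curious aggregator can sum and rescale the encrypted updates without ever seeing a plaintext parameter.

```python
# Sketch: Paillier-encrypted federated averaging (illustrative, not FedSDP's code).
import numpy as np
from phe import paillier

# Key pair held by the model user; participants encrypt with the public key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

def encrypt_params(params):
    """Encrypt a flat parameter vector element-wise with Paillier."""
    return [public_key.encrypt(float(w)) for w in params]

def aggregate(encrypted_updates):
    """Server-side FedAvg on ciphertexts: Paillier addition and scalar
    multiplication work directly on encrypted values, so the aggregator
    never needs (or gets) the plaintext parameters."""
    n = len(encrypted_updates)
    summed = encrypted_updates[0]
    for update in encrypted_updates[1:]:
        summed = [a + b for a, b in zip(summed, update)]
    return [s * (1.0 / n) for s in summed]

# Three simulated participants, each holding a 4-weight local model update.
local_updates = [np.random.randn(4) for _ in range(3)]
encrypted = [encrypt_params(u) for u in local_updates]

# The honest-but-curious aggregator only ever handles ciphertexts.
global_encrypted = aggregate(encrypted)

# Decryption happens at the key holder and recovers the plain FedAvg result.
global_params = np.array([private_key.decrypt(c) for c in global_encrypted])
assert np.allclose(global_params, np.mean(local_updates, axis=0))
print(global_params)
```

The trade-off this sketch makes visible is the one the abstract points at: every parameter is encrypted, exchanged, and operated on as a ciphertext, which adds substantial computation and communication on top of plain federated training in exchange for the stronger privacy guarantee.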