Mohammad Al-Rubaie, Pei-Yuan Wu, J. M. Chang, S. Kung
Title: Privacy-preserving PCA on horizontally-partitioned data
Venue: DASC-PICom-DataCom-CyberSciTech 2017 : 2017 IEEE 15th International Conference on Dependable, Autonomic and Secure Computing ; 2017 IEEE 15th International Conference on Pervasive Intelligence and Computing ; 2017 IEEE 3rd International...
Pages: 280-287
Published: 2017-08-01
DOI: https://doi.org/10.1109/DESEC.2017.8073817
Citations: 23
Abstract
Private data is used on a daily basis by a variety of applications in which machine learning algorithms predict our shopping patterns and movie preferences, among other things. Principal component analysis (PCA) is a widely used method for reducing the dimensionality of data. Reducing the data dimension is essential for data visualization, preventing overfitting, and resisting reconstruction attacks. In this paper, we propose methods that enable PCA to be computed on data that is horizontally partitioned among multiple data owners, without requiring the owners to stay online during the execution of the protocol. To address this problem, we propose a new protocol that computes the total scatter matrix using additive homomorphic encryption and performs the eigendecomposition using garbled circuits. Our hybrid protocol does not reveal any of the data owners' inputs, thus protecting their privacy. We implemented our protocols using Java and Obliv-C, and conducted experiments using public datasets. We show that our protocols are efficient and preserve privacy while maintaining accuracy.
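The abstract does not spell out the protocol, so the following is only a minimal sketch of why additive homomorphic encryption is a natural fit for the scatter-matrix step. It assumes the usual meaning of horizontal partitioning (each owner holds different samples over the same features) and relies on the standard identity S = R - (1/n) v v^T, where R is the sum of outer products x x^T, v is the sum of the samples, and n is the sample count. Each of these is a plain sum over owners, so the shares can be combined using addition alone. All function names are hypothetical, plain numbers stand in for ciphertexts, and neither the encryption nor the garbled-circuit eigendecomposition is shown.

```python
# Sketch only: plain floats stand in for ciphertexts. The single cross-owner
# operation is element-wise addition, which is the operation an additively
# homomorphic scheme supports. This is not the authors' actual protocol.

def local_stats(rows):
    """One owner's share: sample count, per-feature sums, sum of outer products."""
    d = len(rows[0])
    count = len(rows)
    sums = [sum(r[j] for r in rows) for j in range(d)]
    outer = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    return count, sums, outer


def aggregate(shares):
    """Add the owners' shares element-wise (the step that would run on ciphertexts)."""
    d = len(shares[0][1])
    n = sum(s[0] for s in shares)
    v = [sum(s[1][j] for s in shares) for j in range(d)]
    r = [[sum(s[2][i][j] for s in shares) for j in range(d)] for i in range(d)]
    return n, v, r


def total_scatter(n, v, r):
    """S = R - (1/n) v v^T, which equals the sum of (x - mean)(x - mean)^T."""
    d = len(v)
    return [[r[i][j] - v[i] * v[j] / n for j in range(d)] for i in range(d)]


def direct_scatter(rows):
    """Reference computation on the pooled data, used only for comparison."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows)
             for j in range(d)] for i in range(d)]


# Two hypothetical owners holding different samples of the same two features.
owner_a = [[1.0, 2.0], [3.0, 4.0]]
owner_b = [[5.0, 7.0], [2.0, 1.0], [4.0, 6.0]]

n, v, r = aggregate([local_stats(owner_a), local_stats(owner_b)])
scatter = total_scatter(n, v, r)
```

In this sketch the aggregated result matches `direct_scatter(owner_a + owner_b)`, even though no owner's individual rows are passed to `aggregate`. In a real deployment the shares would be encrypted before leaving each owner, and the eigendecomposition of `scatter` would be carried out inside a secure computation rather than in the clear.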