{"title":"MC2: A Secure Collaborative Computation Platform","authors":"R. A. Popa","doi":"10.1145/3411501.3418609","DOIUrl":"https://doi.org/10.1145/3411501.3418609","url":null,"abstract":"Multiple organizations often wish to aggregate their sensitive data and learn from it, but they cannot do so because they cannot share their data. For example, banks wish to train models jointly over their aggregate transaction data to detect money launderers because criminals hide their traces across different banks. To address such problems, my students and I developed MC2, a framework for secure collaborative computation. My talk will overview our MC2 platform, from the technical approach to results and adoption. Biography: Raluca Ada Popa is a computer security professor at UC Berkeley. She is a co-founder and co-director of the RISELab at UC Berkeley, where her research is on systems security and applied cryptography. She is also a co-founder and CTO of a cybersecurity startup called PreVeil. Raluca has received her PhD in computer science as well as her Masters and two BS degrees, in computer science and in mathematics, from MIT. She is the recipient of a Sloan Foundation Fellowship award, NSF Career, Technology Review 35 Innovators under 35, and a George M. Sprowls Award for best MIT CS doctoral thesis.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129960141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introduction to Secure Collaborative Intelligence (SCI) Lab","authors":"Pu Duan","doi":"10.1145/3411501.3418606","DOIUrl":"https://doi.org/10.1145/3411501.3418606","url":null,"abstract":"With the rapid development of technology, user privacy and data security are drawing much attention over the recent years. On one hand, how to protect user privacy while making use of customers? data is a challenging task. On the other hand, data silos are becoming one of the most prominent issues for the society. How to bridge these isolated data islands to build better AI and BI systems while meeting the data privacy and regulatory compliance requirements has imposed great challenges. Secure Collaborative Intelligence (SCI) lab at Ant Group dedicates to leverage multiple privacy-preserving technologies on AI and BI to solve these challenges. The goal of SCI lab is to build enterprise-level solutions that allow multiple data owners to achieve joint risk control, joint marketing, joint data analysis and other cross-organization collaboration scenarios without compromising information privacy or violating any related security policy. Compared with other solution providers, SCI lab has been working with top universities and research organizations to build the first privacy-preserving open platform for collaborative intelligence computation in the world. It is the first platform that combines all three cutting-edge privacy-preserving technologies, secure multi-party computation (MPC), differential privacy (DP) and trusted execution environment (TEE) that are based on cryptography, information theory and computer hardware respectively, on multi-party AI and BI collaboration scenarios. During multi-party collaboration, all inputs, computations and results are protected under specific security policy dedicatedly designed for each data owner. At this time, the platform has been applied to various business scenarios in Ant group and Alibaba Group, including joint lending, collaborative data analysis, joint payment fraud detection, etc. More than 20 financial organizations, have been benefited from the secure data collaboration and computing services provided by SCI lab.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134103402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial Detection on Graph Structured Data","authors":"Jinyin Chen, Huiling Xu, Jinhuan Wang, Qi Xuan, Xuhong Zhang","doi":"10.1145/3411501.3419424","DOIUrl":"https://doi.org/10.1145/3411501.3419424","url":null,"abstract":"Graph Neural Networks (GNNs) has achieved tremendous development on perceptual tasks in recent years, such as node classification, graph classification, link prediction, etc. However, recent studies show that deep learning models of GNNs are incredibly vulnerable to adversarial attacks, so enhancing the robustness of such models remains a significant challenge. In this paper, we propose a subgraph based adversarial sample detection against adversarial perturbations. To the best of our knowledge, this is the first work on the adversarial detection in the deep-learning graph classification models, using the Subgraph Networks (SGN) to restructure the graph's features. Moreover, we develop the joint adversarial detector to cope with the more complicated and unknown attacks. Specifically, we first explain how adversarial attacks can easily fool the models and then show that the SGN can facilitate the distinction of adversarial examples generated by state-of-the-art attacks. We experiment on five real-world graph datasets using three different kinds of attack strategies on graph classification. Our empirical results show the effectiveness of our detection method and further explain the SGN's capacity to tell apart malicious graphs.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115320108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CryptoSPN: Expanding PPML beyond Neural Networks","authors":"Amos Treiber, Alejandro Molina, Christian Weinert, T. Schneider, K. Kersting","doi":"10.1145/3411501.3419417","DOIUrl":"https://doi.org/10.1145/3411501.3419417","url":null,"abstract":"The ubiquitous deployment of machine learning (ML) technologies has certainly improved many applications but also raised challenging privacy concerns, as sensitive client data is usually processed remotely at the discretion of a service provider. Therefore, privacy-preserving machine learning (PPML) aims at providing privacy using techniques such as secure multi-party computation (SMPC). Recent years have seen a rapid influx of cryptographic frameworks that steadily improve performance as well as usability, pushing PPML towards practice. However, as it is mainly driven by the crypto community, the PPML toolkit so far is mostly restricted to well-known neural networks (NNs). Unfortunately, deep probabilistic models rising in the ML community that can deal with a wide range of probabilistic queries and offer tractability guarantees are severely underrepresented. Due to a lack of interdisciplinary collaboration, PPML is missing such important trends, ultimately hindering the adoption of privacy technology. In this work, we introduce CryptoSPN, a framework for privacy-preserving inference of sum-product networks (SPNs) to significantly expand the PPML toolkit beyond NNs. SPNs are deep probabilistic models at the sweet-spot between expressivity and tractability, allowing for a range of exact queries in linear time. In an interdisciplinary effort, we combine techniques from both ML and crypto to allow for efficient, privacy-preserving SPN inference via SMPC. We provide CryptoSPN as open source and seamlessly integrate it into the SPFlow library (Molina et al., arXiv 2019) for practical use by ML experts. Our evaluation on a broad range of SPNs demonstrates that CryptoSPN achieves highly efficient and accurate inference within seconds for medium-sized SPNs.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132521792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Secure Collaborative Training and Inference for XGBoost","authors":"Andrew Law, Chester Leung, Rishabh Poddar, R. A. Popa, Chenyu Shi, Octavian Sima, Chaofan Yu, Xingmeng Zhang, Wenting Zheng","doi":"10.1145/3411501.3419420","DOIUrl":"https://doi.org/10.1145/3411501.3419420","url":null,"abstract":"In recent years, gradient boosted decision tree learning has proven to be an effective method of training robust models. Moreover, collaborative learning among multiple parties has the potential to greatly benefit all parties involved, but organizations have also encountered obstacles in sharing sensitive data due to business, regulatory, and liability concerns. We propose Secure XGBoost, a privacy-preserving system that enables multiparty training and inference of XGBoost models. Secure XGBoost protects the privacy of each party's data as well as the integrity of the computation with the help of hardware enclaves. Crucially, Secure XGBoost augments the security of the enclaves using novel data-oblivious algorithms that prevent access side-channel attacks on enclaves induced via access pattern leakage.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122373005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neither Private Nor Fair: Impact of Data Imbalance on Utility and Fairness in Differential Privacy","authors":"Tom Farrand, FatemehSadat Mireshghallah, Sahib Singh, Andrew Trask","doi":"10.1145/3411501.3419419","DOIUrl":"https://doi.org/10.1145/3411501.3419419","url":null,"abstract":"Deployment of deep learning in different fields and industries is growing day by day due to its performance, which relies on the availability of data and compute. Data is often crowd-sourced and contains sensitive information about its contributors, which leaks into models that are trained on it. To achieve rigorous privacy guarantees, differentially private training mechanisms are used. However, it has recently been shown that differential privacy can exacerbate existing biases in the data and have disparate impacts on the accuracy of different subgroups of data. In this paper, we aim to study these effects within differentially private deep learning. Specifically, we aim to study how different levels of imbalance in the data affect the accuracy and the fairness of the decisions made by the model, given different levels of privacy. We demonstrate that even small imbalances and loose privacy guarantees can cause disparate impacts.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128067692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MP2ML: A Mixed-Protocol Machine Learning Framework for Private Inference","authors":"Fabian Boemer, Rosario Cammarota, Daniel Demmler, T. Schneider, Hossein Yalame","doi":"10.1145/3411501.3419425","DOIUrl":"https://doi.org/10.1145/3411501.3419425","url":null,"abstract":"We present an extended abstract of MP2ML, a machine learning framework which integrates Intel nGraph-HE, a homomorphic encryption (HE) framework, and the secure two-party computation framework ABY, to enable data scientists to perform private inference of deep learning (DL) models trained using popular frameworks such as TensorFlow at the push of a button. We benchmark MP2ML on the CryptoNets network with ReLU activations, on which it achieves a throughput of 33.3 images/s and an accuracy of 98.6%. This throughput matches the previous state-of-the-art frameworks.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132731485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}