Engineering Privacy-Preserving Machine Learning Protocols

T. Schneider
{"title":"Engineering Privacy-Preserving Machine Learning Protocols","authors":"T. Schneider","doi":"10.1145/3411501.3418607","DOIUrl":null,"url":null,"abstract":"Privacy-preserving machine learning (PPML) protocols allow to privately evaluate or even train machine learning (ML) models on sensitive data while simultaneously protecting the data and the model. So far, most of these protocols were built and optimized by hand, which requires expert knowledge in cryptography and also a thorough understanding of the ML models. Moreover, the design space is very large as there are many technologies that can even be combined with several trade-offs. Examples for the underlying cryptographic building blocks include homomorphic encryption (HE) where computation typically is the bottleneck, and secure multi-party computation protocols (MPC) that rely mostly on symmetric key cryptography where communication is often the~bottleneck. In this keynote, I will describe our research towards engineering practical PPML protocols that protect models and data. First of all, there is no point in designing PPML protocols for too simple models such as Support Vector Machines (SVMs) or Support Vector Regression Machines (SVRs), because they can be stolen easily [10] and hence do not benefit from protection. Complex models can be protected and evaluated in real-time using Trusted Execution Environments (TEEs) which we demonstrated for speech recognition using Intel SGX[5] and for keyword recognition using ARM TrustZone[3] as respective commercial TEE technologies. Our goal is to build tools for non-experts in cryptography to automatically generate highly optimized mixed PPML protocols given a high-level specification in a ML framework like TensorFlow. Towards this, we have built tools to automatically generate optimized mixed protocols that combine HE and different MPC protocols [6-8]. Such mixed protocols can for example be used for the efficient privacy-preserving evaluation of decision trees [1, 2, 9, 13] and neural networks[2, 11, 12]. The first PPML protocols for these ML classifiers were proposed long before the current hype on PPML started [1, 2, 12]. We already have first results for compiling high-level ML specifications via our tools into mixed protocols for neural networks (from TensorFlow) [4] and sum-product networks (from SPFlow) [14], and I will conclude with major open challenges.","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3411501.3418607","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Privacy-preserving machine learning (PPML) protocols allow machine learning (ML) models to be privately evaluated, or even trained, on sensitive data while simultaneously protecting both the data and the model. So far, most of these protocols have been built and optimized by hand, which requires expert knowledge in cryptography as well as a thorough understanding of the ML models. Moreover, the design space is very large: there are many underlying technologies, they can be combined with one another, and each combination comes with its own trade-offs. Examples of the underlying cryptographic building blocks include homomorphic encryption (HE), where computation is typically the bottleneck, and secure multi-party computation (MPC) protocols, which rely mostly on symmetric-key cryptography and where communication is often the bottleneck. In this keynote, I will describe our research towards engineering practical PPML protocols that protect models and data. First of all, there is no point in designing PPML protocols for overly simple models such as Support Vector Machines (SVMs) or Support Vector Regression Machines (SVRs), because they can be stolen easily [10] and hence do not benefit from protection. Complex models can be protected and evaluated in real time using Trusted Execution Environments (TEEs), which we demonstrated for speech recognition using Intel SGX [5] and for keyword recognition using ARM TrustZone [3] as the respective commercial TEE technologies. Our goal is to build tools that let non-experts in cryptography automatically generate highly optimized mixed PPML protocols from a high-level specification in an ML framework such as TensorFlow. Towards this, we have built tools that automatically generate optimized mixed protocols combining HE and different MPC protocols [6-8]. Such mixed protocols can, for example, be used for the efficient privacy-preserving evaluation of decision trees [1, 2, 9, 13] and neural networks [2, 11, 12]. The first PPML protocols for these ML classifiers were proposed long before the current hype around PPML started [1, 2, 12]. We already have first results on compiling high-level ML specifications via our tools into mixed protocols for neural networks (from TensorFlow) [4] and sum-product networks (from SPFlow) [14], and I will conclude with major open challenges.
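
To give a rough intuition for the MPC side of such mixed protocols, the following is a minimal Python sketch of two-party additive secret sharing with a trusted-dealer Beaver triple, used to compute a private dot product between a client's feature vector and a server's model weights. This is not the tooling described in the abstract: the field size, the function names, and the simulated dealer and "opening" steps are illustrative assumptions, and a real protocol would run over a network and generate the triples cryptographically (e.g., via HE or oblivious transfer).

```python
# A toy, local simulation of two-party additive secret sharing over Z_P with a
# trusted-dealer Beaver triple. Names, the field size P, and the dealer are
# illustrative assumptions; real PPML frameworks replace the dealer with a
# cryptographic triple-generation protocol and run over a network.
import random

P = 2**61 - 1  # prime modulus; all shares live in Z_P


def share(x):
    """Split x into two additive shares; each share alone reveals nothing."""
    r = random.randrange(P)
    return (r, (x - r) % P)


def reconstruct(s0, s1):
    """Recombine the two shares (in a real protocol this 'opens' a value)."""
    return (s0 + s1) % P


def add_shares(a, b):
    """Addition is local: each party adds its own shares, no communication."""
    return ((a[0] + b[0]) % P, (a[1] + b[1]) % P)


def dealer_triple():
    """Trusted dealer hands out shares of a random triple a, b, c = a*b."""
    a, b = random.randrange(P), random.randrange(P)
    return share(a), share(b), share(a * b % P)


def mul_shares(x, y, triple):
    """Beaver multiplication: the parties open d = x - a and e = y - b
    (the only communication) and then locally derive shares of x * y."""
    (a0, a1), (b0, b1), (c0, c1) = triple
    d = reconstruct((x[0] - a0) % P, (x[1] - a1) % P)
    e = reconstruct((y[0] - b0) % P, (y[1] - b1) % P)
    z0 = (c0 + d * b0 + e * a0 + d * e) % P  # party 0 adds the public d*e term
    z1 = (c1 + d * b1 + e * a1) % P
    return (z0, z1)


if __name__ == "__main__":
    # Private dot product: the client holds the features, the server the weights.
    features = [3, 1, 4]
    weights = [2, 7, 1]
    acc = share(0)
    for f, w in zip(features, weights):
        acc = add_shares(acc, mul_shares(share(f), share(w), dealer_triple()))
    expected = sum(f * w for f, w in zip(features, weights)) % P
    assert reconstruct(*acc) == expected
    print("reconstructed dot product:", reconstruct(*acc))
```

In this kind of MPC protocol the cost of every multiplication shows up as communication (the opening of d and e), which is why communication, rather than computation, tends to be the bottleneck; an HE-based variant would instead spend its time on expensive ciphertext operations, matching the trade-off described above.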