Towards Privacy-Preserving Classification-as-a-Service for DGA Detection

2021 18th International Conference on Privacy, Security and Trust (PST). Pub Date: 2021-12-13. DOI: 10.1109/PST52912.2021.9647755

Arthur Drichel, M. A. Gurabi, Tim Amelung, Ulrike Meyer


Citations: 3

Abstract

Towards Privacy-Preserving Classification-as-a-Service for DGA Detection

Domain generation algorithm (DGA) classifiers can be used to detect and block the establishment of a connection between bots and their command-and-control server. Classification-as-a-service (CaaS) can separate the classification of domain names from the need for real-world training data, which are difficult to obtain but mandatory for well-performing classifiers. However, domain names as well as trained models may contain privacy-critical information which should not be leaked to either the model provider or the data provider. Several generic frameworks for privacy-preserving machine learning (ML) have been proposed in the past that can preserve data and model privacy. Thus, it seems high time to combine state-of-the-art DGA classifiers and privacy-preservation frameworks to enable privacy-preserving CaaS that preserves both data and model privacy for the DGA detection use case. In this work, we examine the real-world applicability of four generic frameworks for privacy-preserving ML using different state-of-the-art DGA detection models. Our results show that out-of-the-box DGA detection models are computationally infeasible for privacy-preserving inference in a real-world setting. We propose model simplifications that reduce inference latency by up to 95% and communication complexity by up to 97%, while causing an accuracy penalty of less than 0.17%. Despite this significant improvement, real-time classification is still not feasible in a traditional two-party setting. Thus, more efficient secure multi-party computation (SMPC) or homomorphic encryption (HE) schemes are required to enable real-world feasibility of privacy-preserving CaaS for DGA detection.
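
To give a rough sense of why two-party inference is costly, the sketch below simulates additive secret sharing with Beaver-triple multiplication for a single dot product, the basic building block of a linear layer. It is an illustration only and is not taken from the paper or from any of the four frameworks it evaluates. The ring size `MOD`, the in-process "trusted dealer", and the function names `share`, `reconstruct`, `beaver_multiply`, and `secure_dot` are all assumptions made for this example, and both parties run in one process here.

```python
import secrets

MOD = 2**32  # ring size for the additive shares (illustrative choice)


def share(value):
    """Split value into two additive shares that sum to value mod MOD."""
    s0 = secrets.randbelow(MOD)
    return s0, (value - s0) % MOD


def reconstruct(s0, s1):
    """Recombine two additive shares into the underlying value."""
    return (s0 + s1) % MOD


def beaver_multiply(x_sh, y_sh):
    """Multiply two secret-shared values with a Beaver triple (a, b, a*b)."""
    # Hypothetical trusted dealer, simulated locally: random a, b and c = a*b.
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    a_sh, b_sh, c_sh = share(a), share(b), share(a * b % MOD)
    # The parties open the masked values d = x - a and e = y - b.
    # In a real deployment this opening is one round of communication.
    d = reconstruct((x_sh[0] - a_sh[0]) % MOD, (x_sh[1] - a_sh[1]) % MOD)
    e = reconstruct((y_sh[0] - b_sh[0]) % MOD, (y_sh[1] - b_sh[1]) % MOD)
    # Each party computes its product share locally; party 0 adds the public d*e.
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % MOD
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % MOD
    return z0, z1


def secure_dot(features, weights):
    """Dot product of private features (data provider) and weights (model provider)."""
    acc0 = acc1 = 0
    for x, w in zip(features, weights):
        z0, z1 = beaver_multiply(share(x % MOD), share(w % MOD))
        acc0 = (acc0 + z0) % MOD  # share held by party 0
        acc1 = (acc1 + z1) % MOD  # share held by party 1
    return reconstruct(acc0, acc1)


if __name__ == "__main__":
    print(secure_dot([1, 2, 3], [4, 5, 6]))
```

Every multiplication consumes one triple and opens two masked values, so the communication grows with the number of multiplications in the model. Under these assumptions, that is one plausible reading of why simplifying the model reduces both latency and communication.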
