Cache-Based Application Detection in the Cloud Using Machine Learning

Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. Pub Date: 2017-04-02. DOI: 10.1145/3052973.3053036

Berk Gülmezoglu, T. Eisenbarth, B. Sunar


Citations: 22

Abstract

Cross-VM attacks have emerged as a major threat on commercial clouds. These attacks commonly exploit hardware-level leakages on shared physical servers. A co-located machine can readily detect the presence of a co-located instance with a heavy computational load through the performance degradation caused by contention on shared resources. Shared cache architectures such as the last level cache (LLC) have become a popular leakage source for mounting cross-VM attacks. By exploiting LLC leakages, researchers have already shown that it is possible to recover fine-grained information such as cryptographic keys from popular software libraries. This makes it essential to verify implementations that handle sensitive data across the many versions and numerous target platforms, a task too complicated, error-prone, and costly to be handled by human beings. Here we propose a machine-learning-based technique to classify applications according to their cache access profiles. We show that, with minimal and simple manual processing steps, feature vectors can be used to train support vector machine models that classify the applications with a high degree of success. The profiling and training steps are completely automated and do not require any inspection or study of the code to be classified. In native execution, we achieve a classification success rate as high as 98% (L1 cache) and 78% (LLC) over 40 benchmark applications in the Phoronix suite with mild training. In the cross-VM setting on the noisy Amazon EC2 platform, the success rate drops to 60% for a suite of 25 applications. With this initial study we demonstrate that it is possible to train meaningful models to successfully predict applications running in co-located instances.
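The classification idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the per-cache-set "profiles" are synthetic stand-ins for measured cache traces, the two applications and their hot cache sets are hypothetical, and the classifier is a small linear SVM trained by hinge-loss subgradient descent rather than whatever SVM configuration the paper used. All names (`make_profile`, `train_linear_svm`, `APP_A_SETS`, `APP_B_SETS`) are assumptions made for illustration.

```python
import random

def make_profile(rng, hot_sets, n_sets=16, noise=0.15):
    """Synthetic feature vector: one miss-rate-like value per cache set.

    Hypothetical stand-in for a measured cache access profile.
    """
    return [
        min(1.0, max(0.0, (0.8 if s in hot_sets else 0.2) + rng.gauss(0, noise)))
        for s in range(n_sets)
    ]

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Linear SVM via subgradient descent on the regularized hinge loss.

    Labels must be +1 or -1. Returns (weights, bias).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:
                # Sample is inside the margin: regularize and push toward yi.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Sample is correctly classified with margin: only regularize.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(model, x):
    """Return +1 (application A) or -1 (application B)."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

rng = random.Random(0)
APP_A_SETS = {0, 1, 2, 3}      # hypothetical: application A stresses sets 0-3
APP_B_SETS = {8, 9, 10, 11}    # hypothetical: application B stresses sets 8-11

def dataset(n):
    """Build n labelled profiles per application."""
    X, y = [], []
    for _ in range(n):
        X.append(make_profile(rng, APP_A_SETS)); y.append(+1)
        X.append(make_profile(rng, APP_B_SETS)); y.append(-1)
    return X, y

X_train, y_train = dataset(50)
X_test, y_test = dataset(50)
model = train_linear_svm(X_train, y_train)
accuracy = sum(predict(model, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The synthetic classes are deliberately well separated, so the accuracy printed here says nothing about the rates reported in the abstract. Real traces from a noisy shared cache would need multi-class training over dozens of applications and far more careful feature extraction.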



