Machine Learning for Anomaly Detection and Categorization in Multi-Cloud Environments

2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud). Pub Date: 2017-06-01. DOI: 10.1109/CSCloud.2017.15

Tara Salman, D. Bhamare, A. Erbad, R. Jain, M. Samaka


Citations: 74

Abstract

Cloud computing has been widely adopted by application service providers (ASPs) and enterprises to reduce both capital expenditures (CAPEX) and operational expenditures (OPEX). Applications and services that previously ran in private data centers are now being migrated to private or public clouds. Because most ASPs and enterprises have globally distributed user bases, their services need to be distributed across multiple clouds around the globe to achieve better latency, scalability, and load balancing. This shift has led the research community to study multi-cloud environments. However, widespread acceptance of such environments has been hampered by major security concerns. Firewalls and traditional rule-based security techniques are not sufficient to protect user data in multi-cloud scenarios. Recent advances in machine learning have drawn the research community toward building intrusion detection systems (IDS) that detect anomalies in network traffic. Most of this work, however, does not differentiate among types of attacks, even though that distinction is necessary for choosing appropriate countermeasures and defenses. In this paper, we investigate both detecting and categorizing anomalies, rather than detection alone, which is the common trend in contemporary research. We use a popular, publicly available dataset to build and test learning models for both detection and categorization of different attacks. Specifically, we use two supervised machine learning techniques: linear regression (LR) and random forest (RF). We show that even if detection is perfect, categorization can be less accurate because of similarities between attacks. Our results demonstrate detection accuracy above 99% and categorization accuracy of 93.6%, with some attacks that could not be categorized. Further, we argue that such categorization can be applied to multi-cloud environments using the same machine learning techniques.
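The gap between detection and categorization accuracy described in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: the label names, the toy predictions, and the choice to score categorization over all samples (benign traffic included) are assumptions made for illustration only. Detection collapses every attack category into a single "attack" class, while categorization requires the exact category to match.

```python
from typing import Sequence, Tuple

NORMAL = "normal"  # hypothetical label for benign traffic (assumption)


def detection_and_categorization_accuracy(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[float, float]:
    """Return (detection accuracy, categorization accuracy).

    Assumption: both metrics are computed over all samples.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    # Detection: only the normal-vs-attack decision has to be right.
    detected = sum((t == NORMAL) == (p == NORMAL) for t, p in pairs)
    # Categorization: the predicted category must match exactly.
    categorized = sum(t == p for t, p in pairs)
    return detected / n, categorized / n


# Made-up labels for illustration; these are not results from the paper.
y_true = ["normal", "normal", "dos", "dos", "exploit", "exploit", "fuzzer", "fuzzer"]
y_pred = ["normal", "normal", "dos", "exploit", "exploit", "exploit", "fuzzer", "dos"]

det_acc, cat_acc = detection_and_categorization_accuracy(y_true, y_pred)
print(f"detection accuracy:      {det_acc:.3f}")  # 1.000
print(f"categorization accuracy: {cat_acc:.3f}")  # 0.750
```

In this toy data every attack is flagged as an attack, so detection is perfect, yet two attacks are assigned to the wrong category, which lowers categorization accuracy. Confusions of this kind between similar attack types are the effect the abstract attributes its lower categorization figure to.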



