Model-based Cluster Analysis for Identifying Suspicious Activity Sequences in Software

Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics · Pub Date: 2017-03-24 · DOI: 10.1145/3041008.3041014

Hemank Lamba, Thomas J. Glazier, J. Cámara, B. Schmerl, D. Garlan, J. Pfeffer


Citations: 7

Abstract

Large software systems have to contend with a significant number of users who interact with different components of the system in various ways. The sequences of components used during an interaction define sets of behaviors that users have with the system, and these sets can be large. Some of these users may exhibit anomalous behaviors; for example, they may have found back doors into the system and be doing something malicious. Such behaviors can be hard to distinguish from normal behavior, either because of the number of interactions a system may have or because traces may deviate only slightly from normal behavior. In this paper we describe a model-based approach to cluster sequences of user behaviors within a system and to find suspicious, or anomalous, sequences. We exploit the underlying software architecture of a system to define these sequences. We further show that our approach is better at detecting suspicious activities than other approaches, specifically those that use unigrams and bigrams for anomaly detection. We show this on a simulation of a large-scale system based on an Amazon Web application-style architecture.
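To make the bigram baseline mentioned in the abstract concrete, the following is a minimal sketch of bigram-based anomaly scoring over component sequences. It is not the paper's model-based clustering approach, and it is not taken from the paper. The component names (`web`, `auth`, `app`, `db`), the add-one smoothing, and the scoring function are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bigrams(trace):
    """Return the adjacent component pairs of a trace."""
    return list(zip(trace, trace[1:]))


def fit_bigram_counts(traces):
    """Count bigrams and their left-hand contexts over all traces."""
    pair_counts = Counter()
    context_counts = Counter()
    vocab = set()
    for trace in traces:
        vocab.update(trace)
        for a, b in bigrams(trace):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts, vocab


def anomaly_score(trace, pair_counts, context_counts, vocab_size):
    """Mean negative log-probability per transition (add-one smoothing).

    Higher scores mean the trace uses transitions that are rare in the data.
    """
    pairs = bigrams(trace)
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        p = (pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size)
        total -= math.log(p)
    return total / len(pairs)


# Hypothetical traces: many normal requests and one that skips
# authentication and goes straight from the web tier to the database.
normal = [["web", "auth", "app", "db"]] * 50
suspicious = ["web", "db"]
traces = normal + [suspicious]

pair_counts, context_counts, vocab = fit_bigram_counts(traces)
scores = [
    anomaly_score(t, pair_counts, context_counts, len(vocab)) for t in traces
]
most_anomalous = traces[scores.index(max(scores))]
```

In this toy data the trace that bypasses `auth` receives the highest score because its only transition is rare. A trace that deviates only slightly from normal behavior would be much harder to separate with such local statistics, which is the kind of limitation the abstract attributes to unigram and bigram methods.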

