Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security: Latest Publications

Thwarting Fake OSN Accounts by Predicting their Victims

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808772

Yazan Boshmaf, M. Ripeanu, K. Beznosov, E. Santos-Neto

Citations: 21

Detecting Clusters of Fake Accounts in Online Social Networks

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808779

Cao Xiao, D. Freeman, Theodore Hwa

Abstract: Fake accounts are a preferred means for malicious users of online social networks to send spam, commit fraud, or otherwise abuse the system. A single malicious actor may create dozens to thousands of fake accounts in order to scale their operation to reach the maximum number of legitimate members. Detecting and taking action on these accounts as quickly as possible is imperative in order to protect legitimate members and maintain the trustworthiness of the network. However, any individual fake account may appear to be legitimate on first inspection, for example by having a real-sounding name or a believable profile. In this work we describe a scalable approach to finding groups of fake accounts registered by the same actor. The main technique is a supervised machine learning pipeline for classifying an entire cluster of accounts as malicious or legitimate. The key features used in the model are statistics on fields of user-generated text such as name, email address, company or university; these include both frequencies of patterns within the cluster (e.g., do all of the emails share a common letter/digit pattern?) and comparison of text frequencies across the entire user base (e.g., are all of the names rare?). We apply our framework to analyze account data on LinkedIn grouped by registration IP address and registration date. Our model achieved AUC 0.98 on a held-out test set and AUC 0.95 on out-of-sample testing data. The model has been deployed to production and has identified more than 250,000 fake accounts since deployment.
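The two feature families named in the abstract (pattern frequencies within a cluster, and text frequencies against the whole user base) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the letter/digit shape encoding, and the `global_name_counts` lookup table are all assumptions made for illustration.

```python
import re
from collections import Counter


def email_pattern(email: str) -> str:
    """Reduce the local part of an address to a letter/digit shape.

    Hypothetical encoding: letters become 'L', digits become 'D',
    and runs of the same symbol are collapsed, so 'amy101' -> 'LD'.
    """
    local = email.split("@", 1)[0]
    shape = re.sub(r"[A-Za-z]", "L", local)
    shape = re.sub(r"[0-9]", "D", shape)
    return re.sub(r"(.)\1+", r"\1", shape)


def cluster_features(emails, names, global_name_counts, total_users):
    """Compute illustrative per-cluster statistics (assumed feature set)."""
    patterns = Counter(email_pattern(e) for e in emails)
    # Within-cluster feature: share of accounts using the most common shape.
    top_share = patterns.most_common(1)[0][1] / len(emails)
    # Across-user-base feature: how common each name is among all users.
    name_freqs = [
        global_name_counts.get(n.lower(), 0) / total_users for n in names
    ]
    return {
        "top_email_pattern_share": top_share,
        "distinct_email_patterns": len(patterns),
        "mean_name_frequency": sum(name_freqs) / len(name_freqs),
    }
```

A cluster whose emails nearly all share one shape and whose names are all rare would yield a high `top_email_pattern_share` and a low `mean_name_frequency`; a supervised classifier would then be trained on such per-cluster vectors.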

Citations: 169

Better Malware Ground Truth: Techniques for Weighting Anti-Virus Vendor Labels

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808780

Alex Kantchelian, Michael Carl Tschantz, Sadia Afroz, Brad Miller, Vaishaal Shankar, Rekha Bachwani, A. Joseph, J. D. Tygar

Abstract: We examine the problem of aggregating the results of multiple anti-virus (AV) vendors' detectors into a single authoritative ground-truth label for every binary. To do so, we adapt a well-known generative Bayesian model that postulates the existence of a hidden ground truth upon which the AV labels depend. We use training based on Expectation Maximization for this fully unsupervised technique. We evaluate our method using 279,327 distinct binaries from VirusTotal, each of which appeared for the first time between January 2012 and June 2014. Our evaluation shows that our statistical model is consistently more accurate at predicting the future-derived ground truth than all unweighted rules of the form "k out of n" AV detections. In addition, we evaluate the scenario where partial ground truth is available for model building. We train a logistic regression predictor on the partial label information. Our results show that as few as 100 randomly selected training instances with ground truth are enough to achieve an 80% true positive rate at a 0.1% false positive rate. In comparison, the best unweighted threshold rule provides only a 60% true positive rate at the same false positive rate.
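To make the contrast between an unweighted "k out of n" rule and a weighted aggregation concrete, here is a minimal sketch. It is not the paper's model: the per-vendor true and false positive rates are supplied as hypothetical constants (the paper learns its parameters with Expectation Maximization), and vendors are assumed conditionally independent given the hidden label.

```python
import math


def k_out_of_n(labels, k):
    """Unweighted rule: flag a binary when at least k vendors detect it."""
    return sum(labels) >= k


def weighted_log_odds(labels, tpr, fpr, prior=0.5):
    """Posterior log-odds that a binary is malicious.

    Assumes each vendor votes independently given the hidden truth,
    with known (here: made-up) true and false positive rates.
    """
    score = math.log(prior / (1 - prior))
    for detected, t, f in zip(labels, tpr, fpr):
        if detected:
            score += math.log(t / f)  # a detection by a precise vendor counts a lot
        else:
            score += math.log((1 - t) / (1 - f))  # a miss counts against
    return score
```

With one precise vendor detecting and two noisy vendors silent, a "2 out of 3" rule says benign, while the weighted score can still come out positive because the precise vendor's vote carries more evidence.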

Citations: 96

Remote Operating System Classification over IPv6

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808777

D. Fifield, A. Geana, Luis MartinGarcia, M. Morbitzer, J. D. Tygar

Abstract: Differences in the implementation of common networking protocols make it possible to identify the operating system of a remote host by the characteristics of its TCP and IP packets, even in the absence of application-layer information. This technique, "OS fingerprinting," is relevant to network security because of its relationship to network inventory, vulnerability scanning, and tailoring of exploits. Various techniques of fingerprinting over IPv4 have been in use for over a decade; however, IPv6 has received comparatively scant attention in both research and practical tools. In this paper we describe an IPv6-based OS fingerprinting engine that is based on a linear classifier. It introduces innovative classification features and network probes that take advantage of the specifics of IPv6, while also making use of existing proven techniques. The engine is deployed in Nmap, a widely used network security scanner. This engine provides good performance at a fraction of the maintenance costs of classical signature-based systems. We describe our work in progress to enhance the deployed system: new network probes that help to further distinguish operating systems, and imputation of incomplete feature vectors.
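As a rough illustration of how a linear classifier can score an incomplete feature vector, the sketch below fills missing probe responses with per-feature means before taking the highest-scoring class. The abstract does not say which imputation method is used, so mean imputation, the feature layout, and every name here are assumptions, not Nmap's implementation.

```python
def impute(features, feature_means):
    """Replace missing values (None) with the per-feature mean (assumed strategy)."""
    return [m if x is None else x for x, m in zip(features, feature_means)]


def classify(features, weights, biases, feature_means):
    """Return the class whose linear score w.x + b is highest.

    `weights` maps a class name to its weight vector; `biases` maps a
    class name to its bias term. Both are hypothetical trained parameters.
    """
    x = impute(features, feature_means)
    scores = {
        name: sum(w * v for w, v in zip(ws, x)) + biases[name]
        for name, ws in weights.items()
    }
    return max(scores, key=scores.get)
```

A host that answers only some probes still gets a prediction, because the unanswered features contribute their average value rather than forcing the vector to be discarded.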

Citations: 8

Differential Privacy for Classifier Evaluation

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808775

Kendrick Boyd, Eric Lantz, David Page

Citations: 25

Machine Learning for Enterprise Security

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808782

P. Manadhata

Citations: 1

Fast, Privacy Preserving Linear Regression over Distributed Datasets based on Pre-Distributed Data

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808774

M. D. Cock, Rafael Dowsley, Anderson C. A. Nascimento, S. Newman

Abstract: This work proposes a protocol for performing linear regression over a dataset that is distributed over multiple parties. The parties will jointly compute a linear regression model without actually sharing their own private datasets. We provide security definitions, a protocol, and security proofs. Our solution is information-theoretically secure and is based on the assumption that a Trusted Initializer pre-distributes random, correlated data to the parties during a setup phase. The actual computation happens later on, during an online phase, and does not involve the trusted initializer. Our online protocol is orders of magnitude faster than previous solutions. In the case where a trusted initializer is not available, we propose a computationally secure two-party protocol based on additive homomorphic encryption that substitutes the trusted initializer. In this case, the online phase remains the same and the offline phase is computationally heavy. However, because the computations in the offline phase happen over random data, the overall problem is embarrassingly parallelizable, making it faster than existing solutions for processors with an appropriate number of cores.
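The idea of a trusted initializer handing out correlated randomness before the online phase can be illustrated with the standard multiplication-triple technique over additive secret shares. This is a generic sketch of that well-known building block, not the paper's protocol (which works on matrices and comes with its own security proofs); the modulus `P` and all function names are assumptions, and both parties are simulated in a single process.

```python
import secrets

P = 2**61 - 1  # prime modulus for the share arithmetic (assumed choice)


def share(x):
    """Split x into two additive shares that sum to x modulo P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P


def trusted_initializer():
    """Setup phase: hand out shares of random a, b and of c = a * b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share((a * b) % P)


def secure_multiply(x_shares, y_shares, triple):
    """Online phase: compute shares of x * y without revealing x or y.

    Only d = x - a and e = y - b are opened; because a and b are uniform,
    d and e reveal nothing about x and y.
    """
    (a0, a1), (b0, b1), (c0, c1) = triple
    x0, x1 = x_shares
    y0, y1 = y_shares
    d = (x0 - a0 + x1 - a1) % P  # opened value x - a
    e = (y0 - b0 + y1 - b1) % P  # opened value y - b
    # x*y = c + d*b + e*a + d*e; one party adds the public d*e term.
    z0 = (c0 + d * b0 + e * a0 + d * e) % P
    z1 = (c1 + d * b1 + e * a1) % P
    return z0, z1
```

The online phase uses only additions and multiplications modulo `P`, which is the kind of cheap arithmetic that lets a pre-distributed-data protocol run much faster than one relying on public-key operations at computation time.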

Citations: 80

Automated Attacks on Compression-Based Classifiers

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808778

Igor Burago, Daniel Lowd

Citations: 3

Subsampled Exponential Mechanism: Differential Privacy in Large Output Spaces

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-10-16 DOI: 10.1145/2808769.2808776

Eric Lantz, Kendrick Boyd, David Page

Citations: 12

Malicious Behavior Detection using Windows Audit Logs

Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security Pub Date : 2015-06-13 DOI: 10.1145/2808769.2808773

Konstantin Berlin, David Slater, Joshua Saxe

Abstract: As antivirus and network intrusion detection systems have increasingly proven insufficient to detect advanced threats, large security operations centers have moved to deploy endpoint-based sensors that provide deeper visibility into low-level events across their enterprises. Unfortunately, for many organizations in government and industry, the installation, maintenance, and resource requirements of these newer solutions pose barriers to adoption and are perceived as risks to organizations' missions. To mitigate this problem we investigated the utility of agentless detection of malicious endpoint behavior, using only the standard built-in Windows audit logging facility as our signal. We found that Windows audit logs, while emitting manageably sized data streams on the endpoints, provide enough information to allow robust detection of malicious behavior. Audit logs provide an effective, low-cost alternative to deploying additional expensive agent-based breach detection systems in many government and industrial settings, and can be used to detect, in our tests, 83% of malware samples with a 0.1% false positive rate. They can also supplement already existing host signature-based antivirus solutions, like Kaspersky, Symantec, and McAfee, detecting, in our testing environment, 78% of malware missed by those antivirus systems.
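Figures such as "83% of malware samples with a 0.1% false positive rate" describe a detection rate at a fixed false positive budget. The sketch below shows one common way to compute such a number from classifier scores; it is an assumption about the evaluation procedure, not taken from the paper, and the function and parameter names are hypothetical.

```python
def tpr_at_fpr(benign_scores, malware_scores, max_fpr):
    """Detection rate at the loosest threshold that keeps FPR <= max_fpr.

    Samples scoring strictly above the threshold are flagged as malicious.
    """
    benign = sorted(benign_scores, reverse=True)
    # Number of benign samples we can afford to flag by mistake.
    allowed = int(max_fpr * len(benign))
    # Threshold sits at the first benign score we are not allowed to flag.
    threshold = benign[allowed] if allowed < len(benign) else float("-inf")
    detected = sum(score > threshold for score in malware_scores)
    return detected / len(malware_scores)
```

Reporting the detection rate at a very small false positive rate matters in this setting because benign events vastly outnumber malicious ones, so even a modest false positive rate would flood analysts with alerts.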

Citations: 77