Beyond heuristics: learning to classify vulnerabilities and predict exploits

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. Pub Date: 2010-07-25. DOI: 10.1145/1835804.1835821

M. Bozorgi, L. Saul, S. Savage, G. Voelker


Citations: 224

Abstract

The security demands on modern system administration are enormous and getting worse. Chief among these demands, administrators must monitor the continual disclosure of software vulnerabilities that have the potential to compromise their systems in some way. Such vulnerabilities include buffer overflow errors, improperly validated inputs, and other unanticipated attack modalities. In 2008, over 7,400 new vulnerabilities were disclosed, well over 100 per week. While no enterprise is affected by all of these disclosures, administrators commonly face many outstanding vulnerabilities across the software systems they manage. Vulnerabilities can be addressed by patches, reconfigurations, and other workarounds; however, these actions may incur downtime or unforeseen side effects. Thus, a key question for systems administrators is which vulnerabilities to prioritize. From publicly available databases that document past vulnerabilities, we show how to train classifiers that predict whether and how soon a vulnerability is likely to be exploited. As input, our classifiers operate on high-dimensional feature vectors that we extract from the text fields, time stamps, cross references, and other entries in existing vulnerability disclosure reports. Compared to current industry-standard heuristics based on expert knowledge and static formulas, our classifiers predict much more accurately whether and how soon individual vulnerabilities are likely to be exploited.
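
The pipeline described in the abstract (extract sparse features from disclosure reports, then train a classifier on past exploit outcomes) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the report fields `description` and `references`, the four toy reports and their labels, and the use of a simple perceptron as a stand-in linear classifier. The abstract does not name the learning algorithm or the exact feature encoding, so this should not be read as the authors' implementation.

```python
import re
from collections import Counter

def extract_features(report):
    """Map a report (hypothetical dict layout) to a sparse feature dict."""
    feats = {"bias": 1.0}
    # Binary bag-of-words features from the free-text description.
    for tok in re.findall(r"[a-z0-9]+", report["description"].lower()):
        feats["text:" + tok] = 1.0
    # One numeric feature from the cross references (assumed field).
    feats["refs:count"] = float(len(report["references"]))
    return feats

def score(weights, feats):
    """Linear score; positive means 'predicted to be exploited'."""
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())

def train_perceptron(examples, epochs=20):
    """examples: list of (feature dict, label) with label in {+1, -1}."""
    weights = Counter()
    for _ in range(epochs):
        mistakes = 0
        for feats, label in examples:
            if label * score(weights, feats) <= 0:  # wrong or undecided
                for k, v in feats.items():
                    weights[k] += label * v
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights

# Invented toy reports; +1 = exploited, -1 = not exploited.
training_reports = [
    ({"description": "Buffer overflow in FTP server allows remote code execution",
      "references": ["a", "b", "c"]}, +1),
    ({"description": "Improper input validation in web form allows remote SQL injection",
      "references": ["a", "b"]}, +1),
    ({"description": "Local denial of service in obscure driver",
      "references": []}, -1),
    ({"description": "Information leak in local debug log",
      "references": ["a"]}, -1),
]

examples = [(extract_features(r), y) for r, y in training_reports]
weights = train_perceptron(examples)

new_report = {"description": "Remote buffer overflow in mail server",
              "references": ["a", "b"]}
low_risk_report = {"description": "Local information leak in driver",
                   "references": []}

for report in (new_report, low_risk_report):
    s = score(weights, extract_features(report))
    print(report["description"], "->", "exploit likely" if s > 0 else "exploit unlikely")
```

The "how soon" question from the abstract could be handled in the same framework by relabeling the examples, for instance +1 if an exploit appeared within some chosen number of days of disclosure and -1 otherwise; that labeling scheme is likewise an assumption of this sketch.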


