Classifying Web Exploits with Topic Modeling

2017 28th International Workshop on Database and Expert Systems Applications (DEXA), Pub Date: 2017-08-01, DOI: 10.1109/DEXA.2017.35

Jukka Ruohonen

Citations: 13

Abstract

This short empirical paper investigates how well topic modeling and database meta-data characteristics can classify web and other proof-of-concept (PoC) exploits for publicly disclosed software vulnerabilities. Using a dataset of over 36 thousand PoC exploits, the empirical experiment attains an accuracy rate of nearly 0.9. Text mining and topic modeling contribute substantially to this classification performance. In addition to these empirical results, the paper contributes to the research tradition of enhancing software vulnerability information with text mining, and it offers a few observations about the potential for semi-automatic classification of exploits in existing tracking infrastructures.
