Server-Side Prediction of Source IP Addresses Using Density Estimation

2009 International Conference on Availability, Reliability and Security. Pub Date: 2009-03-16. DOI: 10.1109/ARES.2009.113

Markus Goldstein, Matthias Reif, A. Stahl, T. Breuel


Citations: 8

Abstract

Source IP addresses are often used as a major feature for user modeling in computer networks. Particularly in the field of Distributed Denial of Service (DDoS) attack detection and mitigation, traffic models make extensive use of source IP addresses for detecting anomalies. Typically, the real IP address distribution is strongly undersampled because only a small number of observations is available. Density estimation overcomes this shortage by taking advantage of IP neighborhood relations. In many cases, simple models are used implicitly or chosen intuitively as a network-based heuristic. In this paper, we first review and formalize existing models, including a hierarchical clustering approach. In addition, we present a modified k-means clustering algorithm for source IP density estimation as well as a statistically motivated smoothing approach using the Nadaraya-Watson kernel-weighted average. For performance evaluation, we apply all methods to a 90-day real-world dataset consisting of 1.3 million different source IP addresses and try to predict the users of the following 10 days. ROC curves and an example DDoS mitigation scenario show that there is no uniformly better approach: k-means performs best when a high detection rate is needed, whereas statistical smoothing works better for low false alarm rate requirements such as the DDoS mitigation scenario.
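To illustrate the smoothing idea mentioned in the abstract, the following is a minimal sketch of a Nadaraya-Watson kernel-weighted average over IPv4 addresses mapped to integers. It is not the authors' implementation: the Gaussian kernel, the bandwidth value, the restriction to a single /24 grid, the treatment of unseen addresses as zero counts, and all function names and example addresses are assumptions made for illustration only.

```python
import ipaddress
import math


def ip_to_int(ip: str) -> int:
    """Map a dotted-quad IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def nadaraya_watson(query: int, xs: list, ys: list, bandwidth: float) -> float:
    """Kernel-weighted average of ys at `query`, using a Gaussian kernel.

    xs are sample positions (integer IPs), ys the observed counts there.
    The Gaussian kernel and the bandwidth are illustrative assumptions.
    """
    weights = [math.exp(-0.5 * ((query - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical observations inside the documentation prefix 192.0.2.0/24.
observed = {"192.0.2.10": 5, "192.0.2.11": 3, "192.0.2.200": 1}
counts = {ip_to_int(ip): n for ip, n in observed.items()}

# Assumption: every address in the /24 is a sample point; unseen ones count 0.
base = ip_to_int("192.0.2.0")
grid = [base + offset for offset in range(256)]
values = [float(counts.get(addr, 0)) for addr in grid]

# An unseen address next to active ones receives a positive smoothed score,
# while an unseen address far from any observation stays close to zero.
score_near = nadaraya_watson(ip_to_int("192.0.2.12"), grid, values, bandwidth=2.0)
score_far = nadaraya_watson(ip_to_int("192.0.2.100"), grid, values, bandwidth=2.0)
```

In this sketch the neighborhood relation is plain numeric distance between addresses, so a never-observed address inherits some score from observed neighbors. Thresholding such a score to accept or reject a source would be one way to trade detection rate against false alarms, but the threshold and bandwidth would have to be tuned on real data.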
