A Graph Neural Network Approach for Scalable and Dynamic IP Similarity in Enterprise Networks

2020 IEEE 9th International Conference on Cloud Networking (CloudNet) | Pub Date: 2020-10-09 | DOI: 10.1109/CloudNet51028.2020.9335789

Hazem M. Soliman, Geoffrey Salmon, Dusan Sovilj, M. Rao


Citations: 4

Abstract

Measuring similarity between IP addresses is an important task in the daily operations of any enterprise network. Applications that depend on an IP similarity measure include correlating security alerts, building baselines for behavioral modelling, debugging network failures and tracking persistent attacks. However, IPs have no natural similarity measure by definition. Deep Learning architectures are a promising solution here since they can learn numerical representations for IPs directly from data, allowing various distance measures to be applied to the learned representations. Existing work has used Natural Language Processing (NLP) techniques for learning IP embeddings. However, these approaches have no systematic way to handle out-of-vocabulary (OOV) IPs not seen during training. In this paper, we propose a novel approach for IP embedding using an adapted graph neural network (GNN) architecture. This approach has the advantages of working on the raw data, scalability and, most importantly, induction, i.e. the ability to measure similarity between previously unseen IPs. Using data from an enterprise network, our approach is able to identify high similarities between local DNS servers and root DNS servers even though some of these machines are never encountered during the training phase.
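The inductive idea in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes a GraphSAGE-style mean aggregation with no learned weights, and the flow records, port list, and function names (`build_graph`, `embed`, `cosine`) are all hypothetical. The point it shows is that an embedding built from a node's own traffic features and those of its neighbors does not depend on the IP's identity, so an IP absent from training can still be embedded and compared.

```python
import math
from collections import defaultdict

# Hypothetical flow records: (source IP, destination IP, destination port).
FLOWS = [
    ("10.0.0.5", "10.0.0.53", 53),
    ("10.0.0.6", "10.0.0.53", 53),
    ("10.0.0.5", "10.0.0.80", 443),
    ("10.0.0.6", "10.0.0.80", 443),
    ("10.0.0.7", "10.0.1.53", 53),  # 10.0.1.53 stands in for an "unseen" IP
    ("10.0.0.8", "10.0.1.53", 53),
]
PORTS = [53, 443]  # assumed feature vocabulary


def build_graph(flows):
    """Return (neighbors, features) built from raw flow records.

    Features are [received per port..., sent per port...], so they
    describe behavior rather than the IP address itself.
    """
    neighbors = defaultdict(set)
    features = defaultdict(lambda: [0.0] * (2 * len(PORTS)))
    for src, dst, port in flows:
        idx = PORTS.index(port)
        neighbors[src].add(dst)
        neighbors[dst].add(src)
        features[dst][idx] += 1.0               # traffic received on this port
        features[src][len(PORTS) + idx] += 1.0  # traffic sent to this port
    return dict(neighbors), dict(features)


def embed(ip, neighbors, features):
    """Concatenate a node's own features with the mean of its neighbors'."""
    own = features[ip]
    nbrs = sorted(neighbors[ip])
    mean = [sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(len(own))]
    return own + mean


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


neighbors, features = build_graph(FLOWS)
dns_a = embed("10.0.0.53", neighbors, features)
dns_b = embed("10.0.1.53", neighbors, features)
web = embed("10.0.0.80", neighbors, features)
print("DNS vs DNS:", round(cosine(dns_a, dns_b), 3))
print("DNS vs web:", round(cosine(dns_a, web), 3))
```

In this toy data the two DNS-like servers score higher against each other than either does against the web server, even though they share no clients. A trained GNN would replace the fixed mean-and-concatenate step with learned transformations, but the inductive property comes from the same source: the embedding function takes features and neighborhood as input, not a per-IP lookup table.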

