Hazem M. Soliman, Geoffrey Salmon, Dusan Sovilj, M. Rao
{"title":"企业网络中可扩展和动态IP相似度的图神经网络方法","authors":"Hazem M. Soliman, Geoffrey Salmon, Dusan Sovilj, M. Rao","doi":"10.1109/CloudNet51028.2020.9335789","DOIUrl":null,"url":null,"abstract":"Measuring similarity between IP addresses is an important task in the daily operations of any enterprise network. Applications that depend on an IP similarity measure include measuring correlation between security alerts, building baselines for behavioral modelling, debugging network failures and tracking persistent attacks. However, IPs do not have a natural similarity measure by definition. Deep Learning architectures are a promising solution here since they are able to learn numerical representations for IPs directly from data, allowing various distance measures to be applied on the calculated representations. Current works have utilized Natural Language Processing (NLP) techniques for learning IP embeddings. However, these approaches have no systematic way to handle out-of-vocabulary (OOV) IPs not seen during training. In this paper, we propose a novel approach for IP embedding using an adapted graph neural network (GNN) architecture. This approach has the advantages of working on the raw data, scalability and, most importantly, induction, i.e. the ability to measure similarity between previously unseen IPs. Using data from an enterprise network, our approach is able to identify high similarities between local DNS servers and root DNS servers even though some of these machines are never encountered during the training phase.","PeriodicalId":156419,"journal":{"name":"2020 IEEE 9th International Conference on Cloud Networking (CloudNet)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"A Graph Neural Network Approach for Scalable and Dynamic IP Similarity in Enterprise Networks\",\"authors\":\"Hazem M. Soliman, Geoffrey Salmon, Dusan Sovilj, M. Rao\",\"doi\":\"10.1109/CloudNet51028.2020.9335789\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Measuring similarity between IP addresses is an important task in the daily operations of any enterprise network. Applications that depend on an IP similarity measure include measuring correlation between security alerts, building baselines for behavioral modelling, debugging network failures and tracking persistent attacks. However, IPs do not have a natural similarity measure by definition. Deep Learning architectures are a promising solution here since they are able to learn numerical representations for IPs directly from data, allowing various distance measures to be applied on the calculated representations. Current works have utilized Natural Language Processing (NLP) techniques for learning IP embeddings. However, these approaches have no systematic way to handle out-of-vocabulary (OOV) IPs not seen during training. In this paper, we propose a novel approach for IP embedding using an adapted graph neural network (GNN) architecture. This approach has the advantages of working on the raw data, scalability and, most importantly, induction, i.e. the ability to measure similarity between previously unseen IPs. Using data from an enterprise network, our approach is able to identify high similarities between local DNS servers and root DNS servers even though some of these machines are never encountered during the training phase.\",\"PeriodicalId\":156419,\"journal\":{\"name\":\"2020 IEEE 9th International Conference on Cloud Networking (CloudNet)\",\"volume\":\"33 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 9th International Conference on Cloud Networking (CloudNet)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CloudNet51028.2020.9335789\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 9th International Conference on Cloud Networking (CloudNet)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CloudNet51028.2020.9335789","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Graph Neural Network Approach for Scalable and Dynamic IP Similarity in Enterprise Networks
Measuring similarity between IP addresses is an important task in the daily operations of any enterprise network. Applications that depend on an IP similarity measure include measuring correlation between security alerts, building baselines for behavioral modelling, debugging network failures and tracking persistent attacks. However, IPs do not have a natural similarity measure by definition. Deep Learning architectures are a promising solution here since they are able to learn numerical representations for IPs directly from data, allowing various distance measures to be applied on the calculated representations. Current works have utilized Natural Language Processing (NLP) techniques for learning IP embeddings. However, these approaches have no systematic way to handle out-of-vocabulary (OOV) IPs not seen during training. In this paper, we propose a novel approach for IP embedding using an adapted graph neural network (GNN) architecture. This approach has the advantages of working on the raw data, scalability and, most importantly, induction, i.e. the ability to measure similarity between previously unseen IPs. Using data from an enterprise network, our approach is able to identify high similarities between local DNS servers and root DNS servers even though some of these machines are never encountered during the training phase.