Wen Yang, Meijuan Yin, Xiaonan Liu, Can Wang, Shunran Duan
Proceedings of the 2019 9th International Conference on Communication and Network Security, published 2019-11-15. DOI: 10.1145/3371676.3371694 (https://doi.org/10.1145/3371676.3371694)
Multi-source Landmark Fusion based on Machine Learning
Network entity landmarks are the foundation of IP geolocation, which plays an important role in network security. Integrating landmarks from multiple sources into a single database with high IP coverage and high location accuracy is an important way to improve IP geolocation. However, city-level landmarks provide the city name, while street-level landmarks provide the latitude and longitude of their location. Owing to this inconsistent format, state-of-the-art fusion algorithms cannot effectively integrate the two types of data. This paper therefore proposes Lusion, a multi-source landmark fusion algorithm. We first extend the IP addresses in the landmark data sources, then model the location data of the two types of landmarks with the landmark location mixture model, and finally use the expectation-maximization algorithm to estimate the landmark locations. Simulation experiments on 25 landmark data sources show that the algorithm effectively integrates city-level and street-level landmarks from different data sources and achieves significantly better location accuracy than the original data sources. Furthermore, we evaluate Lusion on real-world datasets, consisting of 7 city-level and 3 street-level landmark data sources, by locating 100 IP addresses each in Hong Kong and Zhengzhou. The geolocation results show that Lusion increases city-level accuracy by at least 8 percentage points compared with the original data sources and reduces the geolocation error from 3.42 km, obtained with the best original landmark data set, to 2.92 km.
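The abstract does not give the form of the landmark location mixture model, so the following is only a rough illustration of the general idea of fusing city-level and street-level reports. It is not Lusion itself. It assumes that each source reports a location with Gaussian noise of unknown, source-specific variance, that a city name can be replaced by an approximate city centroid, and it uses a simple alternating estimation loop in the style of expectation-maximization. The IP extension step is omitted. The names `CITY_CENTROIDS`, `to_point`, and `fuse`, and all sample data, are hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup: approximate city centroids as (latitude, longitude).
CITY_CENTROIDS = {
    "Hong Kong": (22.32, 114.17),
    "Zhengzhou": (34.75, 113.63),
}


def to_point(observation):
    """Map a city name or a (lat, lon) pair to a (lat, lon) pair."""
    if isinstance(observation, str):
        return CITY_CENTROIDS[observation]
    return observation


def fuse(reports, iterations=20, floor=1e-6):
    """Fuse reports of the form {source: {ip: city name or (lat, lon)}}.

    Returns ({ip: (lat, lon)}, {source: estimated noise variance}).
    Assumed model (not from the paper): observation = true location
    plus Gaussian noise whose variance depends only on the source.
    """
    points = {
        source: {ip: to_point(obs) for ip, obs in per_ip.items()}
        for source, per_ip in reports.items()
    }
    variance = {source: 1.0 for source in points}
    estimate = {}
    for _ in range(iterations):
        # Location step: precision-weighted mean of all reports per IP.
        sums = defaultdict(lambda: [0.0, 0.0, 0.0])
        for source, per_ip in points.items():
            weight = 1.0 / variance[source]
            for ip, (lat, lon) in per_ip.items():
                acc = sums[ip]
                acc[0] += weight * lat
                acc[1] += weight * lon
                acc[2] += weight
        estimate = {ip: (a / w, b / w) for ip, (a, b, w) in sums.items()}
        # Reliability step: mean squared residual per source and coordinate.
        # The floor keeps the weights finite when a source fits exactly.
        for source, per_ip in points.items():
            if not per_ip:
                continue
            squared = sum(
                (lat - estimate[ip][0]) ** 2 + (lon - estimate[ip][1]) ** 2
                for ip, (lat, lon) in per_ip.items()
            )
            variance[source] = max(squared / (2 * len(per_ip)), floor)
    return estimate, variance


# Made-up sample: two street-level sources, one city-level, one unreliable.
reports = {
    "street_a": {"ip1": (22.301, 114.171), "ip2": (34.761, 113.651)},
    "street_b": {"ip1": (22.299, 114.169), "ip2": (34.759, 113.649)},
    "city_c": {"ip1": "Hong Kong", "ip2": "Zhengzhou"},
    "noisy_d": {"ip1": (22.50, 114.00), "ip2": (34.60, 113.80)},
}
locations, variances = fuse(reports)
```

In this sketch, sources whose reports disagree with the consensus receive a larger estimated variance and therefore less weight in the next location step. Treating a city name as a single centroid point is a strong simplification; a real mixture model would presumably represent the spatial extent of a city explicitly.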