An IP Geolocation Database Evaluation and Fusion Model Based on Data Correlation and Delay Similarity
Xie Bo, Li Han, Wang Yong
Proceedings of the 2nd International Conference on Telecommunications and Communication Engineering, 2018-11-28
DOI: 10.1145/3291842.3291876 (https://doi.org/10.1145/3291842.3291876)
Citations: 3
Abstract
IP geolocation databases are widely used in many Internet services. At present, these databases contain many inaccurate or missing geolocations, yet the industry lacks an effective method to evaluate them. Based on delay measurement and on the assumption that the majority of entries in well-known databases are correct, this paper proposes an IP geolocation database evaluation and fusion model based on data correlation and delay similarity. First, we improve the previous evaluation model based on data-consistency-rate by introducing the geolocation coverage rate at different granularities. Second, by measuring the delays of IP addresses at large scale, we determine the standard delay of each city, and then calculate, for each database, the delay similarity rate between an IP's own delay and the standard delay of the city it is geolocated to. Third, we use weighted voting to fuse inconsistent geolocations among databases, where the vote share is determined by the improved data-consistency-rate and the delay similarity rate, and produce a single fused database. Finally, taking the 340 million IP addresses allocated to mainland China as an example, the accuracy of the proposed model is increased by 8.79% compared with the existing model.
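
The weighted-voting fusion step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the improved data-consistency-rate and the delay similarity rate are combined into a vote share, so the product used in `vote_weight` is an assumption, and the function names, database names, and rates below are all hypothetical.

```python
from collections import defaultdict


def vote_weight(consistency_rate, delay_similarity_rate):
    # Assumption: the two rates are combined by a simple product.
    # The abstract does not give the actual formula.
    return consistency_rate * delay_similarity_rate


def fuse_geolocation(candidates, weights):
    """Pick one city for an IP by weighted voting.

    candidates: dict mapping database name -> city (or None if the entry is missing)
    weights:    dict mapping database name -> vote weight of that database
    Returns the city with the largest total weight, or None if no database has an entry.
    """
    totals = defaultdict(float)
    for db, city in candidates.items():
        if city is None:
            continue  # a missing geolocation casts no vote
        totals[city] += weights[db]
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical per-database rates (consistency, delay similarity).
weights = {
    "db_a": vote_weight(0.9, 0.8),
    "db_b": vote_weight(0.7, 0.6),
    "db_c": vote_weight(0.6, 0.6),
}

# Three databases disagree about the same IP address.
candidates = {"db_a": "Beijing", "db_b": "Shanghai", "db_c": "Shanghai"}
fused_city = fuse_geolocation(candidates, weights)
```

In this example the two lower-weighted databases jointly outvote the single higher-weighted one, which is the behaviour that distinguishes weighted voting from simply trusting the best-scoring database.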