ProbeGeo: A Comprehensive Landmark Mining Framework Based on Web Content

Impact factor 3.0 · Zone 3 (Computer Science) · JCR Q2 (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)
Jinlei Lin;Chenglong Li;Wenwen Gong;Guanglei Song;Linna Fan;Zhiliang Wang;Jiahai Yang
IEEE/ACM Transactions on Networking, vol. 32, no. 5, pp. 4398-4413. Published 2024-07-30. Journal Article.
DOI: 10.1109/TNET.2024.3422089
URL: https://ieeexplore.ieee.org/document/10615999/

Abstract

IP geolocation is essential for many location-aware Internet applications, and its accuracy depends decisively on high-quality landmarks. However, previous work on mining landmarks from the Internet is hampered by limited quantity, poor coverage, and insufficient landmark quality. In this paper, we present ProbeGeo, a new framework that mines high-quality landmarks automatically. We divide landmarks into common landmarks and probe landmarks and provide systematic mining methods based on online retrieval and web content. ProbeGeo expands traditional common landmarks by exploiting the exposure of multiple IoT (Internet of Things) devices on the Internet, mining them through search engines and webpage content. Common landmarks, which consist of multiple device types, significantly improve landmark quantity and coverage. Furthermore, ProbeGeo establishes a methodology for acquiring new probe landmarks from the webpages of Internet VPs (Vantage Points): it extracts geographical locations from heterogeneous webpages and utilizes active probe functions. Probe landmarks enhance landmark quality and functionality, enabling new geolocation frameworks and breaking through the geolocation accuracy bottleneck. We develop ProbeGeo as a continuously running system and conduct real-world experiments to validate its efficacy. Our results show that ProbeGeo detects 89,849 high-quality landmarks, comprising 6,874 probe landmarks and 82,975 common landmarks. This is about 10x more landmarks than existing work, distributed across 181 countries and 7,094 cities. ProbeGeo landmarks cover more than 8 types of devices, and more than 60% of them remain stable over one month. Moreover, more than 58% of ProbeGeo landmarks have an accuracy above street level, which previous work has not achieved. By correlating landmarks at large scale, ProbeGeo can provide geolocation services with higher landmark accuracy and broader coverage.
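
The split into common and probe landmarks, and the coverage figures reported above, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Landmark` record, its field names, and the `summarize` helper are hypothetical, and the sample records are made up (they use reserved documentation IP ranges).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmark:
    """Hypothetical landmark record; field names are illustrative only."""
    ip: str
    kind: str      # "common" or "probe"
    country: str
    city: str


def summarize(landmarks):
    """Count landmarks per kind and the distinct countries and cities covered."""
    by_kind = Counter(lm.kind for lm in landmarks)
    countries = {lm.country for lm in landmarks}
    # Key cities by (country, city) so same-named cities are not merged.
    cities = {(lm.country, lm.city) for lm in landmarks}
    return {
        "total": len(landmarks),
        "common": by_kind["common"],
        "probe": by_kind["probe"],
        "countries": len(countries),
        "cities": len(cities),
    }


# Made-up sample records, not data from the paper.
sample = [
    Landmark("192.0.2.10", "common", "US", "Boston"),
    Landmark("192.0.2.11", "common", "US", "Austin"),
    Landmark("198.51.100.7", "probe", "DE", "Berlin"),
]
stats = summarize(sample)
```

The same tally applied to the reported figures is consistent: 6,874 probe landmarks plus 82,975 common landmarks gives the stated total of 89,849.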
Source journal
IEEE/ACM Transactions on Networking (Engineering & Technology: Telecommunications)
CiteScore: 8.20
Self-citation rate: 5.40%
Articles published: 246
Review time: 4-8 weeks
Journal description: The high-level objective of the IEEE/ACM Transactions on Networking is to publish high-quality, original research results derived from theoretical or experimental exploration of the area of communication/computer networking, covering all sorts of information transport networks over all sorts of physical layer technologies, both wireline (all kinds of guided media, e.g., copper, optical) and wireless (e.g., radio-frequency, acoustic (e.g., underwater), infra-red), or hybrids of these. The journal welcomes applied contributions reporting on novel experiences and experiments with actual systems.