Retrieving address-based locations from the web
Dirk Ahlers, Susanne CJ Boll
Workshop on Geographic Information Retrieval, 2008-10-29
DOI: 10.1145/1460007.1460015 (https://doi.org/10.1145/1460007.1460015)
Citations: 37
Abstract
Geospatial search on the Web determines how a document's content relates to a location within a region. Some pedestrian scenarios require finer granularity, down to individual buildings. In this paper, we describe a process that extracts and simultaneously verifies precise addresses on German Web pages using a validating parser. We describe how address-level location extraction benefits from extensive use of prior geographic knowledge and its structure. Analyzing address structure, components, and dependencies leads to the design of a geoparser that identifies valid addresses within unstructured Web content. We also discuss noteworthy issues that arise in the process.
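
To make the idea of a validating parser concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny in-memory gazetteer keyed by postal code (the `GAZETTEER` contents, the function name `find_addresses`, and the 80-character look-back window are all hypothetical choices for illustration) and the common German layout "street house-number, postal-code city". A postal code serves as the anchor; a candidate is accepted only if the city and street are consistent with the gazetteer entry for that code.

```python
import re

# Hypothetical miniature gazetteer: postal code -> city -> known street names.
# The entries are placeholders for illustration, not real geographic data.
GAZETTEER = {
    "12345": {"Musterstadt": {"Musterstraße", "Beispielweg"}},
}

# German postal codes have five digits.
POSTAL_CODE = re.compile(r"\b(\d{5})\b")
# House number directly after a street name, e.g. " 12", " 12a", " 12 a".
HOUSE_NUMBER = re.compile(r"\s+(\d{1,4}(?:\s?[a-zA-Z])?)\b")


def find_addresses(text, window=80):
    """Return (street, house_number, postal_code, city) tuples that validate
    against GAZETTEER. `window` is an assumed look-back limit in characters."""
    results = []
    for match in POSTAL_CODE.finditer(text):
        code = match.group(1)
        cities = GAZETTEER.get(code)
        if cities is None:
            continue  # five-digit number that is not a known postal code
        after = text[match.end():].lstrip()
        before = text[:match.start()][-window:]
        for city, streets in cities.items():
            if not after.startswith(city):
                continue  # city does not match this postal code
            for street in streets:
                idx = before.rfind(street)
                if idx == -1:
                    continue  # street not known for this code and city
                number = HOUSE_NUMBER.match(before, idx + len(street))
                if number:
                    results.append((street, number.group(1), code, city))
    return results


if __name__ == "__main__":
    sample = "Besuchen Sie uns: Musterstraße 12a, 12345 Musterstadt. Tel. 54321"
    print(find_addresses(sample))
```

Anchoring on the postal code and then checking the dependent components is one way to let the structure of the geographic knowledge drive extraction and verification in a single pass; a realistic system would need a full gazetteer and tolerance for spelling variants and alternative layouts.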