{"title":"Itinerary retrieval: travelers, like traveling salesmen, prefer efficient routes","authors":"M. Adelfio, H. Samet","doi":"10.1145/2675354.2675355","DOIUrl":null,"url":null,"abstract":"Internet users share large quantities of text and multimedia content that becomes easily accessible to others via hyperlinks and search engine results. However, structured datasets generally lack this level of exposure. One example is the travel itinerary, which many Internet users post online in the form of a spreadsheet or web page table, yet the collection of such itineraries remains difficult to search or browse due to insufficient parsing and indexing by search engines. Enabling interaction with user-uploaded itineraries could provide valuable information to trip planners who are researching travel options and to businesses attempting to understand travel patterns. This work examines the challenges of identifying and extracting itineraries from spreadsheets and web page tables to support such applications, with a focus on differentiating between itineraries and other documents with geographic content.","PeriodicalId":286892,"journal":{"name":"Proceedings of the 8th Workshop on Geographic Information Retrieval","volume":"84 11","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 8th Workshop on Geographic Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2675354.2675355","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12
Abstract
Internet users share large quantities of text and multimedia content that becomes easily accessible to others via hyperlinks and search engine results. However, structured datasets generally lack this level of exposure. One example is the travel itinerary, which many Internet users post online in the form of a spreadsheet or web page table, yet the collection of such itineraries remains difficult to search or browse due to insufficient parsing and indexing by search engines. Enabling interaction with user-uploaded itineraries could provide valuable information to trip planners who are researching travel options and to businesses attempting to understand travel patterns. This work examines the challenges of identifying and extracting itineraries from spreadsheets and web page tables to support such applications, with a focus on differentiating between itineraries and other documents with geographic content.