Mapping Historical Documents to Geographical Space

T. Hirayama, Hidetsugu Nanba, T. Takezawa
Published in: Adjunct Proceedings of the 13th International Conference on Mobile and Ubiquitous Systems: Computing Networking and Services, 2016-11-28
DOI: 10.1145/3004010.3004028 (https://doi.org/10.1145/3004010.3004028)
Citations: 1

Abstract

Geotagging is the process of recognizing place and facility names in a document and assigning a latitude/longitude pair to each. The assignment step relies on an external geographic database that pairs place/facility names with latitude/longitude values. However, when a historical document uses a former place/facility name, no coordinates can be assigned to it, even though the current name of the same place is listed in the database. Furthermore, when the geographic database contains multiple entries with the same place/facility name, the correct one must be chosen. In this paper, we propose a method for constructing a database of current and former place/facility name pairs. We apply a machine learning-based information extraction method to text corpora to extract such pairs automatically. We also propose a method for disambiguating identical place/facility names. We conducted experiments to confirm the effectiveness of our method.
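
The lookup problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gazetteer contents, the approximate coordinates, the names `GAZETTEER`, `FORMER_TO_CURRENT`, and `geotag`, and the disambiguation heuristic (prefer the candidate closest to the unambiguous places mentioned in the same document) are all assumptions made for illustration. The abstract does not specify how the proposed disambiguation works.

```python
from math import radians, sin, cos, asin, sqrt

# Hypothetical gazetteer: current name -> candidate (lat, lon) entries.
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "Tokyo": [(35.6762, 139.6503)],
    # Two places share this name (one in Tokyo, one in Hiroshima Prefecture).
    "Fuchu": [(35.6689, 139.4777), (34.5683, 133.2367)],
}

# Hypothetical table of former -> current name pairs, i.e. the kind of
# database the paper proposes to build automatically.
FORMER_TO_CURRENT = {"Edo": "Tokyo"}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def geotag(names):
    """Map each recognized name to one (lat, lon); unknown names are skipped."""
    # Step 1: rewrite former names to current ones, then collect candidates.
    candidates = {}
    for name in names:
        current = FORMER_TO_CURRENT.get(name, name)
        if current in GAZETTEER:
            candidates[name] = GAZETTEER[current]

    # Step 2: unambiguous names act as anchors for the ambiguous ones.
    anchors = [c[0] for c in candidates.values() if len(c) == 1]

    result = {}
    for name, cands in candidates.items():
        if len(cands) == 1 or not anchors:
            # Single candidate, or nothing to disambiguate against.
            result[name] = cands[0]
        else:
            # Assumed heuristic: pick the candidate nearest to the anchors.
            result[name] = min(
                cands, key=lambda c: sum(haversine_km(c, a) for a in anchors)
            )
    return result
```

Calling `geotag(["Edo", "Fuchu"])` resolves "Edo" through the former-name table to the Tokyo entry, then selects the Fuchu candidate nearer to it. Without the former-name table, "Edo" would receive no coordinates at all, which is the gap the paper sets out to close.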