Extracting spatial relations from document for geographic information retrieval

Yecheng Yuan
Published in: 2011 19th International Conference on Geoinformatics
Publication date: 2011-06-24
DOI: 10.1109/GEOINFORMATICS.2011.5980797
URL: https://doi.org/10.1109/GEOINFORMATICS.2011.5980797
Citations: 10

Abstract

Geographic information retrieval (GIR) aims to retrieve geographical information from unstructured text (commonly web documents). Previous research has focused on applying traditional information retrieval (IR) techniques to GIR, such as ranking geographic relevance with the vector space model (VSM). In many cases, these keyword-based methods cannot support spatial queries well. For example, when searching for documents on “debris flow took place in Hunan last year”, the documents selected in this way may only contain the words “debris flow” and “Hunan” rather than state that a debris flow actually occurred in Hunan. The key reason for this problem is that the spatial relation between the thematic activity (debris flow) and the geographic entity (Hunan) is not captured. In this paper, we present a kernel-based approach and apply it in a support vector machine (SVM) to extract spatial relations from free text for further GIS services and spatial reasoning. First, we analyze the characteristics of spatial relation expressions in natural language and identify two types of spatial relations: topology and direction. Both are used to describe qualitatively the positions of spatial objects relative to each other. Then we explore the use of the dependency tree (which represents the grammatical dependencies in a sentence and can be generated by a syntactic parser) to identify these spatial relations. We observe that the features required to find a relationship between two spatial named entities in the same sentence are typically captured by the shortest path between the two entities in the dependency tree. Therefore, we construct a shortest path dependency kernel for the SVM to complete the task. The experimental results show that our dependency tree kernel achieves a significant improvement over the previous method.
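
The abstract does not give the exact form of the kernel, so the following is only a minimal Python sketch of the general idea, not the author's implementation. It assumes one common formulation of a shortest-path dependency kernel: two paths of different length score 0, and otherwise the score is the product, position by position, of the number of features the aligned tokens share. The dependency edges, the feature sets, and the function names `shortest_path` and `path_kernel` are all hypothetical and hand-written for illustration; a real system would obtain them from a syntactic parser and a named-entity tagger.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected view of a dependency tree.

    `edges` is a list of (head, dependent) token pairs (hand-written here).
    Returns the list of tokens from `start` to `goal`, or [] if unreachable.
    """
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return []


def path_kernel(path_a, path_b, features):
    """Assumed kernel form: 0 if the lengths differ, else the product of
    the number of shared features at each aligned position."""
    if not path_a or len(path_a) != len(path_b):
        return 0
    score = 1
    for tok_a, tok_b in zip(path_a, path_b):
        score *= len(features[tok_a] & features[tok_b])
    return score


# Hypothetical parse of "A debris flow occurred in Hunan".
edges_1 = [("occurred", "flow"), ("flow", "debris"),
           ("occurred", "in"), ("in", "Hunan")]
# Hypothetical parse of "A landslide happened in Sichuan".
edges_2 = [("happened", "landslide"), ("happened", "in"), ("in", "Sichuan")]

# Illustrative per-token features: the word itself plus coarse tags.
features = {
    "flow": {"flow", "NOUN", "EVENT"},
    "landslide": {"landslide", "NOUN", "EVENT"},
    "debris": {"debris", "NOUN"},
    "occurred": {"occurred", "VERB"},
    "happened": {"happened", "VERB"},
    "in": {"in", "ADP"},
    "Hunan": {"Hunan", "PROPN", "PLACE"},
    "Sichuan": {"Sichuan", "PROPN", "PLACE"},
}

path_1 = shortest_path(edges_1, "flow", "Hunan")
path_2 = shortest_path(edges_2, "landslide", "Sichuan")
similarity = path_kernel(path_1, path_2, features)
```

In this toy case both paths have four tokens (event, verb, preposition, place), so the kernel is nonzero; a path of any other length would score 0 against them. A matrix of such pairwise scores could then be supplied to an SVM implementation that accepts a precomputed kernel, which is one plausible way to realize the setup the abstract describes.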