Evaluation of NER systems for the recognition of place mentions in French thematic corpora

Carmen Brando, Catherine Dominguès, Magali Capeyron
Proceedings of the 10th Workshop on Geographic Information Retrieval, 2016-10-31
DOI: 10.1145/3003464.3003471 (https://doi.org/10.1145/3003464.3003471)
Citations: 7

Abstract

Ongoing initiatives promoted by cultural institutions and public administrations engage in the development of textual corpora contributed by the general public. In this work, we deal with a spoken corpus of life stories and a crowd-sourced Web corpus of people's contributions related to urban planning issues in their city. Located information constitutes an essential component of these corpora. Toponyms refer to official names (e.g. Congo), which are listed in gazetteers, but often also to generic locations such as un endroit très beau (a beautiful place). Because of the nature of the corpora, these generic locations are inherently subjective, vague and descriptive. To enable automated exploitation of these texts, it is crucial to properly detect such place mentions. To this end, the present work provides a comparative study of state-of-the-art NER systems, most importantly of supervised tools such as Stanford NER, for the identification of generic locations in thematic corpora.
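The abstract does not say how the compared systems are scored. As a minimal sketch only, assuming exact-match span scoring with precision, recall and F1 (a common convention in NER evaluation, not necessarily the one used in the paper), a comparison of gold and predicted place mentions could look like the following. The sentence, the annotations and the names `Span`, `span_of` and `evaluate` are all hypothetical and chosen for illustration.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention
    label: str  # entity type, e.g. "LOC"


def span_of(text: str, mention: str, label: str = "LOC") -> Span:
    """Build a span for the first occurrence of `mention` in `text`."""
    start = text.index(mention)
    return Span(start, start + len(mention), label)


def evaluate(gold: set[Span], predicted: set[Span]) -> dict[str, float]:
    """Exact-match precision, recall and F1 over place-mention spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: one official name and one generic location.
text = "J'ai grandi au Congo, un endroit très beau."
gold = {span_of(text, "Congo"), span_of(text, "un endroit très beau")}

# Assumed output of a gazetteer-style system that only finds the official name.
predicted = {span_of(text, "Congo")}

scores = evaluate(gold, predicted)
print(scores)
```

In this invented case the system is fully precise but misses the generic location, which is the kind of gap the study sets out to measure. Partial-overlap scoring would be a reasonable alternative for vague, descriptive mentions whose boundaries are hard to fix.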