Identifying the Truth: Aggregation of Named Entity Extraction Results

International Conference on Information Integration and Web-based Applications & Services. Pub Date: 2013-12-02. DOI: 10.1145/2539150.2539160

Katja Pfeifer, J. Meinecke


Citations: 2

Abstract

Huge amounts of textual information relevant for market analysis, trending, or product monitoring can be found on the Web. To exploit that knowledge, a number of extraction services have been proposed that extract and categorize entities from a given text. Prior work showed that combining individual extractors can increase quality. However, so far no system exists that is fully applicable to reasonably combining real-world extraction services that differ substantially in the entity types they extract and the schemata they use. In this paper, we propose an aggregation system and a corresponding aggregation process that can be used for these services. We present a number of novel aggregation techniques that incorporate schema information as well as characteristics specific to entity extraction into the aggregation process. The aggregation system is broadly evaluated on six real-world named entity recognition services and compared to state-of-the-art approaches.
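To make the aggregation problem concrete, here is a minimal sketch of a plain majority-vote baseline over the outputs of several extractors. It is not the aggregation technique proposed in the paper, which the abstract does not specify. The service names (`svc_a`, `svc_b`, `svc_c`), their type labels, the shared schema, and the `aggregate` function are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from each service's own type labels to a shared schema.
# Note that svc_a has no location type at all.
TYPE_MAP = {
    "svc_a": {"Person": "PERSON", "Company": "ORG"},
    "svc_b": {"PER": "PERSON", "ORG": "ORG", "LOC": "LOCATION"},
    "svc_c": {"people": "PERSON", "organisation": "ORG", "place": "LOCATION"},
}


def aggregate(results, min_votes=2):
    """Majority vote over character spans.

    results maps a service name to a list of (start, end, raw_type) tuples.
    Returns a sorted list of (start, end, common_type) for every span whose
    most frequent common type is backed by at least min_votes services.
    """
    votes = defaultdict(Counter)
    for service, annotations in results.items():
        seen = set()
        for start, end, raw_type in annotations:
            common = TYPE_MAP[service].get(raw_type)
            if common is None:  # label not covered by the shared schema
                continue
            key = (start, end, common)
            if key in seen:  # each service votes once per annotation
                continue
            seen.add(key)
            votes[(start, end)][common] += 1

    merged = []
    for (start, end), counter in votes.items():
        label, count = counter.most_common(1)[0]
        if count >= min_votes:
            merged.append((start, end, label))
    return sorted(merged)


# Made-up annotations for the text "Alice Smith joined Acme in Berlin."
results = {
    "svc_a": [(0, 11, "Person"), (19, 23, "Company")],
    "svc_b": [(0, 11, "PER"), (27, 33, "LOC")],
    "svc_c": [(0, 11, "people"), (19, 23, "organisation"), (27, 33, "organisation")],
}

merged = aggregate(results)
print(merged)
```

In this made-up example the span for "Berlin" is dropped: only two of the three services can vote on it because `svc_a` has no location type, and those two disagree. This is the kind of situation the abstract points to when it says the services differ in the entity types they extract, and it suggests why unweighted voting alone may not be enough.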
