Information Extraction

Jerry R. Hobbs
Handbook of Natural Language Processing. DOI: 10.1201/9781420085938-c21 (https://doi.org/10.1201/9781420085938-c21)

Abstract

Information Extraction (IE) techniques aim to extract the names of entities and objects from text and to identify the roles that they play in event descriptions. IE systems generally focus on a specific domain or topic, searching only for information that is relevant to a user's interests. In this chapter, we first give historical background on information extraction and discuss several kinds of information extraction tasks that have emerged in recent years. Next, we outline the series of steps involved in creating a typical information extraction system, which can be encoded as a cascaded finite-state transducer. Along the way, we present examples to illustrate what each step does. Finally, we present an overview of different learning-based methods for information extraction, including supervised learning approaches, weakly supervised and bootstrapping techniques, and discourse-oriented approaches.

Information extraction (IE) is the process of scanning text for information relevant to some interest, including extracting entities, relations, and, most challenging, events, that is, who did what to whom, when, and where. It requires deeper analysis than keyword searches, but its aims fall short of the very hard and long-term problem of text understanding, where we seek to capture all the information in a text, along with the speaker's or writer's intention.
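The idea of a cascade, in which each stage consumes the output of the previous one, can be illustrated with a minimal Python sketch. This is not the system described in the chapter: regular expressions stand in for finite-state transducers, and the patterns, the `<ORG>` markup, and the function names (`tag_orgs`, `extract_acquisitions`, `run_cascade`) are all assumptions made up for illustration.

```python
import re

# Stage 1 pattern (toy assumption): one or more capitalized words
# followed by a corporate suffix, e.g. "Widget Works Inc."
COMPANY = re.compile(r"\b(?:[A-Z][a-z]+ )+(?:Corp|Inc|Ltd)\.?")

# Stage 2 pattern (toy assumption): an acquisition event expressed over
# the <ORG>...</ORG> markup produced by stage 1, not over raw text.
ACQUIRE = re.compile(
    r"<ORG>(?P<buyer>[^<]+)</ORG> (?:acquired|bought) "
    r"<ORG>(?P<target>[^<]+)</ORG>"
)


def tag_orgs(text):
    """Stage 1: wrap recognized company names in <ORG> tags."""
    return COMPANY.sub(lambda m: f"<ORG>{m.group(0)}</ORG>", text)


def extract_acquisitions(tagged):
    """Stage 2: fill a simple event template from the tagged text."""
    return [
        {
            "event": "acquisition",
            "buyer": m.group("buyer"),
            "target": m.group("target"),
        }
        for m in ACQUIRE.finditer(tagged)
    ]


def run_cascade(text):
    """Run the stages in order; each stage reads the previous stage's output."""
    return extract_acquisitions(tag_orgs(text))


if __name__ == "__main__":
    print(run_cascade("Acme Corp acquired Widget Works Inc. last year."))
```

The event pattern in stage 2 never looks at raw words to decide what an organization is; it relies entirely on the markup from stage 1. That separation is what makes the arrangement a cascade: entity recognition can be improved or replaced without touching the event patterns, as long as the intermediate markup stays the same.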