Extracting and Aligning Timelines

Mark A. Finlayson, Andres Cremisini, M. Ocal
Published in: Computational Analysis of Storylines
Publication date: 2021-11-30
DOI: 10.1017/9781108854221.006 (https://doi.org/10.1017/9781108854221.006)
Citation count: 7

Abstract

Understanding the timeline of a story is a necessary first step for extracting storylines. This is difficult because timelines are not explicitly given in documents, and parts of a story may be found across multiple documents, either repeated or in fragments. We outline prior work and the state of the art in both timeline extraction and alignment of timelines across documents. With regard to timeline extraction, there has been significant work over the past 40 years on representing temporal information in text, but most of it has focused on temporal graphs and not timelines. In the past 15 years researchers have begun to consider the problem of extracting timelines from these graphs, but the approaches have been incomplete and inexact. We review these approaches and describe recent work of our own that solves timeline extraction exactly. With regard to timeline alignment, most efforts have focused only on the specific task of cross-document event coreference (CDEC). Current approaches to CDEC fall into two camps: event-only clustering and joint event–entity clustering, with joint clustering using neural methods achieving state-of-the-art performance. All CDEC approaches rely on document clustering to generate a tractable search space. We note both the shortcomings and the advantages of these approaches and, importantly, we describe how CDEC falls short of full timeline alignment. We outline next steps to advance the field toward full timeline alignment across documents, which can serve as a foundation for extracting higher-level, more abstract storylines.
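To make the distinction between a temporal graph and a timeline concrete, the following is a minimal sketch, not the authors' method. It assumes a temporal graph reduced to hypothetical (earlier, later) "before" pairs between events and uses Python's standard `graphlib` to produce one total order consistent with them. The function name `linearize` and the event labels are illustrative assumptions. Real temporal graphs carry richer relations (overlap, inclusion, and so on), and a partial order generally admits many consistent timelines, which is part of why simple approaches like this one are incomplete.

```python
from graphlib import TopologicalSorter, CycleError


def linearize(before_pairs):
    """Return one total order consistent with (earlier, later) pairs.

    Returns None when the pairs are contradictory (contain a cycle).
    This is an illustrative simplification: only 'before' relations
    are handled, and only one of possibly many valid orders is returned.
    """
    sorter = TopologicalSorter()
    for earlier, later in before_pairs:
        # graphlib expects add(node, *predecessors): 'later' comes after 'earlier'.
        sorter.add(later, earlier)
    try:
        return list(sorter.static_order())
    except CycleError:
        # An inconsistent temporal graph has no valid timeline.
        return None


# Hypothetical events from a short story, given as "A before B" pairs.
pairs = [("wake", "eat"), ("eat", "leave")]
timeline = linearize(pairs)
print(timeline)  # ['wake', 'eat', 'leave']

# Contradictory relations cannot be placed on a single timeline.
print(linearize([("a", "b"), ("b", "a")]))  # None
```

The chain above has exactly one consistent order. With fewer constraints, for example only `("wake", "eat")` and `("wake", "leave")`, either `eat` or `leave` could come second, so any single returned order is one choice among several.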