网络报纸文章的重叠目标事件与故事线检测

Yifang Wei, L. Singh, Brian Gallagher, David J. Buttler
{"title":"网络报纸文章的重叠目标事件与故事线检测","authors":"Yifang Wei, L. Singh, Brian Gallagher, David J. Buttler","doi":"10.1109/DSAA.2016.30","DOIUrl":null,"url":null,"abstract":"Event detection from text data is an active area of research. While the emphasis has been on event identification and labeling using a single data source, this work considers event and story line detection when using a large number of data sources. In this setting, it is natural for different events in the same domain, e.g. violence, sports, politics, to occur at the same time and for different story lines about the same event to emerge. To capture events in this setting, we propose an algorithm that detects events and story lines about events for a target domain. Our algorithm leverages a multi-relational sentence level semantic graph and well known graph properties to identify overlapping events and story lines within the events. We evaluate our approach on two large data sets containing millions of news articles from a large number of sources. Our empirical analysis shows that our approach improves the detection precision and recall by 10% to 25%, while providing complete event summaries.","PeriodicalId":193885,"journal":{"name":"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)","volume":"157 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Overlapping Target Event and Story Line Detection of Online Newspaper Articles\",\"authors\":\"Yifang Wei, L. Singh, Brian Gallagher, David J. Buttler\",\"doi\":\"10.1109/DSAA.2016.30\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Event detection from text data is an active area of research. While the emphasis has been on event identification and labeling using a single data source, this work considers event and story line detection when using a large number of data sources. In this setting, it is natural for different events in the same domain, e.g. violence, sports, politics, to occur at the same time and for different story lines about the same event to emerge. To capture events in this setting, we propose an algorithm that detects events and story lines about events for a target domain. Our algorithm leverages a multi-relational sentence level semantic graph and well known graph properties to identify overlapping events and story lines within the events. We evaluate our approach on two large data sets containing millions of news articles from a large number of sources. Our empirical analysis shows that our approach improves the detection precision and recall by 10% to 25%, while providing complete event summaries.\",\"PeriodicalId\":193885,\"journal\":{\"name\":\"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)\",\"volume\":\"157 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DSAA.2016.30\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSAA.2016.30","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

摘要

从文本数据中检测事件是一个活跃的研究领域。虽然重点是使用单个数据源进行事件识别和标记,但这项工作在使用大量数据源时考虑了事件和故事线检测。在这种情况下,同一领域的不同事件(如暴力、体育、政治)在同一时间发生,同一事件的不同故事线出现是很自然的。为了捕获这种设置中的事件,我们提出了一种算法来检测目标域的事件和关于事件的故事线。我们的算法利用多关系句子级语义图和众所周知的图属性来识别重叠事件和事件中的故事线。我们在两个大型数据集上评估了我们的方法,这些数据集包含来自大量来源的数百万篇新闻文章。我们的实证分析表明,我们的方法在提供完整的事件摘要的同时,将检测精度和召回率提高了10%到25%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Overlapping Target Event and Story Line Detection of Online Newspaper Articles
Event detection from text data is an active area of research. While the emphasis has been on event identification and labeling using a single data source, this work considers event and story line detection when using a large number of data sources. In this setting, it is natural for different events in the same domain, e.g. violence, sports, politics, to occur at the same time and for different story lines about the same event to emerge. To capture events in this setting, we propose an algorithm that detects events and story lines about events for a target domain. Our algorithm leverages a multi-relational sentence level semantic graph and well known graph properties to identify overlapping events and story lines within the events. We evaluate our approach on two large data sets containing millions of news articles from a large number of sources. Our empirical analysis shows that our approach improves the detection precision and recall by 10% to 25%, while providing complete event summaries.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信