Monitoring Entities in an Uncertain World: Entity Resolution and Referential Integrity

Steven Minton, Sofus A. Macskassy, P. LaMonica, Kane See, Craig A. Knoblock, Greg Barish, M. Michelson, R. Liuzzi
Conference on Innovative Applications of Artificial Intelligence, 2011-08-04. DOI: 10.1609/aaai.v25i2.18860 (https://doi.org/10.1609/aaai.v25i2.18860)
Citations: 6

Abstract

This paper describes a system to help intelligence analysts track and analyze information being published in multiple sources, particularly open sources on the Web. The system integrates technology for Web harvesting, natural language extraction, and network analytics, and allows analysts to view and explore the results via a Web application. One of the difficult problems we address is the entity resolution problem, which occurs when there are multiple, differing ways to refer to the same entity. The problem is particularly complex when noisy data is being aggregated over time, there is no clean master list of entities, and the entities under investigation are intentionally being deceptive. Our system must not only perform entity resolution with noisy data, but must also gracefully recover when entity resolution mistakes are subsequently corrected. We present a case study in arms trafficking that illustrates the issues, and describe how they are addressed.
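
The recovery requirement in the abstract (undoing an entity resolution mistake without losing data) can be illustrated with a small sketch. This is not the authors' implementation, which the abstract does not describe; it assumes one common design in which harvested mentions are stored unchanged and entity assignments live in a separate, editable mapping. The class name EntityStore, its methods, and the company names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EntityStore:
    """Toy store (hypothetical): raw mentions are never modified,
    only the mention-to-entity assignment changes."""
    # mention_id -> name string exactly as harvested
    mentions: dict[int, str] = field(default_factory=dict)
    # mention_id -> entity_id currently assigned by resolution
    assignment: dict[int, int] = field(default_factory=dict)
    _next_id: int = 0

    def _fresh(self) -> int:
        """Return a new identifier, unique across mentions and entities."""
        n = self._next_id
        self._next_id += 1
        return n

    def add_mention(self, name: str) -> int:
        """Record a mention and place it in its own singleton entity."""
        mid = self._fresh()
        self.mentions[mid] = name
        self.assignment[mid] = self._fresh()
        return mid

    def entity_of(self, mid: int) -> int:
        return self.assignment[mid]

    def merge(self, mid_a: int, mid_b: int) -> None:
        """Resolve the entities of two mentions into a single entity."""
        keep, absorb = self.assignment[mid_a], self.assignment[mid_b]
        for mid, eid in self.assignment.items():
            if eid == absorb:
                self.assignment[mid] = keep

    def split(self, mid: int) -> int:
        """Correct a mistaken resolution: move one mention to a new entity."""
        eid = self._fresh()
        self.assignment[mid] = eid
        return eid


# Made-up names, for illustration only.
store = EntityStore()
a = store.add_mention("Acme Air Cargo")
b = store.add_mention("ACME Air Cargo Ltd.")
c = store.add_mention("Acme Aviation")

store.merge(a, b)
store.merge(a, c)   # suppose this resolution later turns out to be wrong
store.split(c)      # the correction: c becomes its own entity again
```

Because the original mention strings are kept separately from the assignments, the correction changes only which entity a mention points to, so earlier merges that were correct (a and b here) are unaffected.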