Unified approach to retrospective event detection for event-based epidemic intelligence.

Impact factor: 1.6 · JCR Q2 (Information Science & Library Science)

International Journal on Digital Libraries · Pub Date: 2021-01-01 · Epub Date: 2021-10-09 · DOI: 10.1007/s00799-021-00308-9

Marco Fisichella

Volume 22, issue 4, pages 339-364 · Publication type: Journal Article · Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8502099/pdf/

Citations: 2

Abstract


Inferring the magnitude and occurrence of real-world events from natural language text is a crucial task in many domains. In public health in particular, state-of-the-art document-centric and token-centric event detection approaches have not kept pace with the growing need for more robust event detection. In this paper, we propose UPHED, a unified approach that combines document-centric and token-centric event detection techniques in an unsupervised manner, so that both rare (aperiodic) and recurring (periodic) events can be detected using a generative model for the public health domain. We evaluate the efficiency of our approach, as well as its effectiveness in two real-world case studies, with respect to the quality of the document clusters. Our results show a precision of 60% and a recall of 71%, measured on manually annotated real-world data. Finally, we compare our work with MedISys, a well-established rule-based system, and find that UPHED can be used cooperatively with MedISys not only to detect similar anomalies but also to deliver more information about the specific outbreaks of reported diseases.
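The abstract reports precision and recall but no combined score. As a minimal sketch of how these two metrics are defined and how they could be combined into an F1 score (the harmonic mean), the following Python snippet may help. The function names and the example counts are hypothetical and chosen only for illustration; the F1 value is derived arithmetically from the two reported figures and is not a result stated in the paper.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not taken from the paper).
example_p, example_r = precision_recall(tp=6, fp=4, fn=2)

# F1 derived from the precision (60%) and recall (71%) quoted in the abstract.
derived_f1 = f1_score(0.60, 0.71)
print(f"precision={example_p:.2f} recall={example_r:.2f} derived F1={derived_f1:.2f}")
```

With the quoted figures, the harmonic mean comes out at roughly 0.65, which sits between the two values and closer to the lower one, as expected for F1.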


Source journal

International Journal on Digital Libraries

CiteScore: 4.30

Self-citation rate: 6.70%

About the journal: The International Journal on Digital Libraries (IJDL) examines the theory and practice of acquisition, definition, organization, management, preservation, and dissemination of digital information via global networking. It covers all aspects of digital libraries (DLs), from large-scale heterogeneous data and information management and access, to linking and connectivity, to security, privacy, and policies, to their application, use, and evaluation.

The scope of IJDL includes, but is not limited to:

The FAIR principle and the digital libraries infrastructure
- Findable: information access and retrieval; semantic search; data and information exploration; information navigation; smart indexing and searching; resource discovery
- Accessible: visualization and digital collections; user interfaces; interfaces for handicapped users; HCI and UX in DLs; security and privacy in DLs; multimodal access
- Interoperable: metadata (definition, management, curation, integration); syntactic and semantic interoperability; linked data
- Reusable: reproducibility; Open Science; sustainability, profitability, repeatability of research results; confidentiality and privacy issues in DLs

Digital library architectures, including heterogeneous and dynamic data management; data and repositories

Acquisition of digital information: authoring environments for digital objects; digitization of traditional content

Digital archiving and preservation
- Digital preservation and curation
- Digital archiving
- Web archiving
- Archiving and preservation strategies

AI for digital libraries
- Machine learning for DLs
- Data mining in DLs
- NLP for DLs

Applications of digital libraries
- Digital humanities
- Open data and their reuse
- Scholarly DLs (incl. bibliometrics, altmetrics)
- Epigraphy and paleography
- Digital museums

Future trends in digital libraries
- Definition of DLs in a ubiquitous digital library world
- Datafication of digital collections

Interaction and user experience (UX) in DLs
- Information visualization
- Collection understanding
- Privacy and security
- Multimodal user interfaces
- Accessibility (or "Access for users with disabilities")
- UX studies