Temporal Relation Extraction in Clinical Texts

ACM Computing Surveys (CSUR). Pub Date: 2021-09-17. DOI: 10.1145/3462475

Yohan Bonescki Gumiel, Lucas Emanuel Silva e Oliveira, V. Claveau, N. Grabar, E. Paraiso, C. Moro, D. Carvalho


Citations: 10

Abstract

Unstructured data in electronic health records, represented by clinical texts, are a vast source of healthcare information because they describe a patient's journey, including clinical findings, procedures, and information about the continuity of care. The publication of several studies on temporal relation extraction from clinical texts during the last decade, together with the organization of multiple shared tasks, highlights the importance of this research theme. We therefore present a review of temporal relation extraction in clinical texts. We analyzed 105 articles and found that relations between events and the document creation time, a coarse type of temporality, have mostly been addressed with traditional machine learning–based models, with only a few recent initiatives pushing the state of the art with deep learning–based models. For temporal relations between entities (events and temporal expressions) within the document, factors such as task complexity and the dataset imbalance caused by candidate pair generation directly affect system performance. The state of the art rests on attention-based models, with contextualized word representations fine-tuned for temporal relation extraction. However, further experiments and advances in the research topic are required before real-time clinical domain applications can be released. Furthermore, most of the publications rely on the same dataset, which underscores the need for new annotation projects that provide datasets for different medical specialties, clinical text types, and even languages.
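The imbalance attributed to candidate pair generation can be illustrated with a minimal sketch. It is not taken from the reviewed systems: the toy entities, the gold relations, the `BEFORE`/`OVERLAP`/`NONE` label names, and the sentence-window heuristic are all assumptions chosen for illustration. The idea is that pairing every two entities within a window yields many candidates, most of which carry no annotated relation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy annotations: entity id -> (entity type, sentence index).
entities = {
    "e1": ("EVENT", 0),
    "e2": ("EVENT", 0),
    "t1": ("TIMEX", 0),
    "e3": ("EVENT", 1),
    "e4": ("EVENT", 2),
}

# Hypothetical gold relations; any candidate pair not listed here is NONE.
gold = {("e1", "t1"): "OVERLAP", ("e2", "e3"): "BEFORE"}


def candidate_pairs(entities, max_sentence_gap=1):
    """Pair every two entities whose sentences are at most max_sentence_gap apart."""
    pairs = []
    for a, b in combinations(sorted(entities), 2):
        if abs(entities[a][1] - entities[b][1]) <= max_sentence_gap:
            pairs.append((a, b))
    return pairs


def label_distribution(pairs, gold):
    """Count how many candidate pairs fall under each relation label."""
    return Counter(gold.get(pair, "NONE") for pair in pairs)


pairs = candidate_pairs(entities)
dist = label_distribution(pairs, gold)
print(len(pairs), dict(dist))
```

Even in this five-entity example, most candidates end up labeled `NONE`. Widening `max_sentence_gap` adds more candidates without adding gold relations, which skews the label distribution further.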
