Temporal Relation Extraction in Clinical Texts

Yohan Bonescki Gumiel, Lucas Emanuel Silva e Oliveira, V. Claveau, N. Grabar, E. Paraiso, C. Moro, D. Carvalho
Journal: ACM Computing Surveys (CSUR), pp. 1-36
Published: 2021-09-17
DOI: 10.1145/3462475 (https://doi.org/10.1145/3462475)
Citation count: 10

Abstract

Unstructured data in electronic health records, in the form of clinical texts, are a vast source of healthcare information because they describe a patient's journey, including clinical findings, procedures, and information about the continuity of care. The publication of several studies on temporal relation extraction from clinical texts during the last decade and the organization of multiple shared tasks highlight the importance of this research theme. We therefore present a review of temporal relation extraction in clinical texts. We analyzed 105 articles and found that relations between events and the document creation time, a coarse type of temporality, were addressed with traditional machine learning–based models, with only a few recent initiatives to push the state of the art with deep learning–based models. For temporal relations between entities (events and temporal expressions) in the document, factors such as dataset imbalance caused by candidate pair generation and task complexity directly affect system performance. The state of the art relies on attention-based models, with contextualized word representations fine-tuned for temporal relation extraction. However, further experiments and advances in the research topic are required before real-time clinical domain applications can be released. Furthermore, most of the publications rely on the same dataset, which highlights the need for new annotation projects that provide datasets for different medical specialties, clinical text types, and even languages.
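The dataset imbalance attributed to candidate pair generation can be illustrated with a small sketch. It is not taken from the reviewed systems: the entity identifiers, the label names (OVERLAP, BEFORE, NONE), and the exhaustive pairing strategy are all assumptions chosen for illustration. The point is only that pairing every entity with every other entity yields far more unrelated pairs than annotated relations.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy annotations for one clinical note: (entity_id, entity_type).
entities = [
    ("e1", "EVENT"),  # e.g. an admission
    ("t1", "TIMEX"),  # e.g. a calendar date
    ("e2", "EVENT"),  # e.g. a symptom
    ("e3", "EVENT"),  # e.g. a medication
    ("t2", "TIMEX"),  # e.g. a relative time expression
]

# Hypothetical gold relations; label names are illustrative only.
gold = {
    ("e1", "t1"): "OVERLAP",
    ("e2", "e3"): "BEFORE",
}


def candidate_pairs(items):
    """Pair every entity with every later entity in the document (assumed strategy)."""
    return [(a[0], b[0]) for a, b in combinations(items, 2)]


def label_pairs(pairs, gold_relations):
    """Attach the gold label to each candidate pair, or NONE when unannotated."""
    return [(pair, gold_relations.get(pair, "NONE")) for pair in pairs]


pairs = candidate_pairs(entities)
labeled = label_pairs(pairs, gold)
counts = Counter(label for _, label in labeled)

print(len(pairs), "candidate pairs")
for label, n in counts.most_common():
    print(label, n)
```

With five entities the exhaustive strategy produces ten candidate pairs, of which only two carry a relation in this toy example. Restricting candidates (for instance to entities within the same sentence) is one common way to reduce the share of NONE pairs, at the cost of missing long-distance relations.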