Explicit and Implicit Section Identification from Clinical Discharge Summaries

Asim Abbas, Jamil Hussain, Muhammad Afzal, H. M. Bilal, Sungyoung Lee, Seokhee Jeon
Published in: 2022 16th International Conference on Ubiquitous Information Management and Communication (IMCOM), 2022-01-03
DOI: 10.1109/IMCOM53663.2022.9721771 (https://doi.org/10.1109/IMCOM53663.2022.9721771)
Citations: 1

Abstract

In the clinical domain, most data is generated as unstructured natural-language text in clinical notes, which contain meaningful but hidden information. Various algorithms have been proposed to recognize and identify the different sections within clinical notes so that they can be converted into a structured format for storage, retrieval, and further processing. We propose an algorithm that recognizes and identifies explicitly and implicitly defined section headings, together with the start and end boundaries of each identified section, to improve information extraction (IE) from clinical notes. A section dictionary containing explicitly defined section names is constructed. Exact term matching is used to identify explicitly defined sections, and a partial term matching procedure based on the Levenshtein distance algorithm is also applied. We evaluated the algorithm on discharge summaries provided by Beth Israel Deaconess Medical Center from the 2010 i2b2 Challenge. In the experiments, the proposed algorithm achieved an overall precision of 100%, recall of 94.5%, and f-score of 97.17%. For explicit sections alone, it achieved 100% precision, 93.53% recall, and a 96.66% f-score; for implicitly identified sections, it achieved 100% precision, 95.46% recall, and a 97.68% f-score. The main goal of the algorithm is to automatically prepare data for data-driven approaches such as machine learning and deep learning, and to enhance the meaningful use of EHR and CDSS systems, clinical outcomes, and events.
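The matching strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the dictionary entries, the heading normalization, the `max_ratio` threshold, and all function names are assumptions made for illustration. The abstract also does not say how partial matches map onto its explicit and implicit categories, so the sketch labels hits only as "exact" or "partial". Each section is assumed to run from its heading line to the line before the next detected heading.

```python
import re

# Hypothetical section dictionary of canonical heading names (illustrative only).
SECTION_DICTIONARY = [
    "history of present illness",
    "past medical history",
    "medications on admission",
    "hospital course",
    "discharge medications",
    "discharge diagnosis",
]


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_heading(line: str, max_ratio: float = 0.25):
    """Return (canonical_name, kind) if the line looks like a heading, else None.

    max_ratio is an assumed threshold on distance / longer string length.
    """
    # Keep the text before the first colon, lowercase it, drop non-letters.
    candidate = re.sub(r"[^a-z ]", "", line.lower().split(":")[0])
    candidate = re.sub(r"\s+", " ", candidate).strip()
    if not candidate:
        return None
    if candidate in SECTION_DICTIONARY:
        return candidate, "exact"
    # Otherwise fall back to the closest dictionary entry by edit distance.
    best = min(SECTION_DICTIONARY, key=lambda name: levenshtein(candidate, name))
    distance = levenshtein(candidate, best)
    if distance / max(len(candidate), len(best)) <= max_ratio:
        return best, "partial"
    return None


def find_sections(text: str):
    """Return (name, kind, start_line, end_line) tuples; end_line is exclusive."""
    lines = text.splitlines()
    hits = []
    for index, line in enumerate(lines):
        match = match_heading(line)
        if match:
            hits.append((match[0], match[1], index))
    sections = []
    for position, (name, kind, start) in enumerate(hits):
        # A section ends where the next heading starts, or at the end of the note.
        end = hits[position + 1][2] if position + 1 < len(hits) else len(lines)
        sections.append((name, kind, start, end))
    return sections


# Made-up note text, not taken from the i2b2 data.
note = """HISTORY OF PRESENT ILLNESS:
The patient is a 64-year-old man admitted with chest pain.
Past Medcal Histry:
Hypertension, diabetes.
Hospital Course
Treated with aspirin and discharged in stable condition."""

for section in find_sections(note):
    print(section)
```

In this toy note, the misspelled heading "Past Medcal Histry" is two edits away from the dictionary entry "past medical history", so it is reported as a partial match, while the other two headings match exactly. Dividing the distance by the longer string length is one plausible way to keep long body lines from matching short headings; a fixed absolute distance would be an equally reasonable choice.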