The Pipeline for Standardizing Russian Unstructured Allergy Anamnesis Using FHIR AllergyIntolerance Resource.

IF 1.3, Zone 4 (Medicine), JCR Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS
Methods of Information in Medicine Pub Date : 2021-09-01 Epub Date: 2021-08-23 DOI:10.1055/s-0041-1733945
Iuliia D Lenivtceva, Georgy Kopanitsa
Volume 60 (3-04), pages 95-103. Publication type: Journal Article.
Citations: 4

Abstract

Background: Most essential medical knowledge is stored as free text, which is difficult to process. Standardization of medical narratives is an important task for data exchange, integration, and semantic interoperability.

Objectives: This article aims to develop an end-to-end pipeline for structuring Russian free-text allergy anamnesis using international standards.

Methods: The pipeline for free-text data standardization is based on FHIR (Fast Healthcare Interoperability Resources) and SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) to ensure semantic interoperability. The pipeline solves common tasks such as data preprocessing, classification, categorization, entity extraction, and semantic code assignment. Machine learning, rule-based, and dictionary-based approaches were used to compose the pipeline. The pipeline was evaluated on 166 randomly chosen medical records.
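To make the stages named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a naive rule-based preprocessing step, tiny hypothetical dictionaries of Russian term stems, and the FHIR R4 shape of AllergyIntolerance. The function names, the dictionary contents, and the patient reference are all assumptions made for illustration. The SNOMED CT codes shown are illustrative and should be verified against an actual SNOMED CT release.

```python
import re

SNOMED = "http://snomed.info/sct"

# Hypothetical toy dictionaries (not the paper's): Russian term stem -> concept.
# The SNOMED CT codes are illustrative and must be checked against a release.
ALLERGENS = {
    "кешью": ("227493005", "Cashew nuts", "food"),
}
REACTIONS = {
    "крапивниц": ("64305001", "Urticaria"),
    "анафила": ("39579001", "Anaphylactic reaction"),
}


def preprocess(text):
    """Lowercase, unify the letter 'ё', drop punctuation, collapse whitespace."""
    text = text.lower().replace("ё", "е")
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def coding(code, display):
    """Build a FHIR CodeableConcept holding a single SNOMED CT coding."""
    return {"coding": [{"system": SNOMED, "code": code, "display": display}]}


def to_allergy_intolerance(text, patient_ref):
    """Return one AllergyIntolerance-shaped dict per allergen found in the text."""
    clean = preprocess(text)
    # Dictionary-based extraction of reactions via substring match on stems.
    reactions = [
        {"manifestation": [coding(code, display)]}
        for stem, (code, display) in REACTIONS.items()
        if stem in clean
    ]
    resources = []
    for stem, (code, display, category) in ALLERGENS.items():
        if stem not in clean:
            continue
        resource = {
            "resourceType": "AllergyIntolerance",
            "category": [category],
            "code": coding(code, display),
            "patient": {"reference": patient_ref},
        }
        if reactions:
            resource["reaction"] = reactions
        resources.append(resource)
    return resources


if __name__ == "__main__":
    note = "Аллергия на орехи кешью: крапивница."
    for res in to_allergy_intolerance(note, "Patient/example"):
        print(res)
```

Substring matching on stems is a deliberately crude stand-in for the machine learning and rule-based components the abstract mentions. A record that names no dictionary term yields an empty list, which plays the role of the filtering step here.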

Results: The AllergyIntolerance resource was used to represent allergy anamnesis. The data preprocessing module included a dictionary of over 90,000 words, including specific medication terms, and more than 20 regular expressions for error correction. The classification and categorization modules resulted in four dictionaries of allergy terms (2,675 terms in total), which were mapped to SNOMED CT concepts. F-scores for the individual steps are 0.945 for filtering, 0.90 to 0.96 for allergy categorization, and 0.90 and 0.93 for allergen and reaction extraction, respectively. The allergy terminology coverage is more than 95%.
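The abstract does not say which F-score variant was used. Assuming the common balanced F1 (the harmonic mean of precision and recall), the figures above could be computed as in the sketch below. The counts in the usage example are hypothetical and are not taken from the paper.

```python
def f_score(tp, fp, fn):
    """Balanced F1 from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for one extraction step (not from the paper).
    print(round(f_score(tp=90, fp=10, fn=10), 3))
```

With equal numbers of false positives and false negatives, precision and recall coincide, and F1 equals both.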

Conclusion: The proposed pipeline is a step toward ensuring semantic interoperability of Russian free-text medical records and could be effective in standardization systems for further data exchange and integration.

Source journal
Methods of Information in Medicine (Medicine, Computer Science: Information Systems)
CiteScore: 3.70
Self-citation rate: 11.80%
Articles published: 33
Review time: 6-12 weeks
About the journal: Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of a journal's issue.