Europe's Largest Research Infrastructure for Curated Medical Data Models with Semantic Annotations.

IF 1.3 | Region 4 (Medicine) | JCR Q3: COMPUTER SCIENCE, INFORMATION SYSTEMS
Methods of Information in Medicine. Pub Date: 2024-05-01. Epub Date: 2024-05-13. DOI: 10.1055/s-0044-1786839
Sarah Riepenhausen, Max Blumenstock, Christian Niklas, Stefan Hegselmann, Philipp Neuhaus, Alexandra Meidt, Cornelia Püttmann, Michael Storck, Matthias Ganzinger, Julian Varghese, Martin Dugas
Pages: 52-61. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495939/pdf/
Citation count: 0

Abstract

Background: Structural metadata from the majority of clinical studies and routine health care systems is not yet available to the scientific community.

Objective: To provide an overview of available contents in the Portal of Medical Data Models (MDM Portal).

Methods: The MDM Portal is a registered European information infrastructure for research and health care, and its contents are curated and semantically annotated by medical experts. It enables users to search, view, discuss, and download existing medical data models.

Results: The most frequent keyword is "clinical trial" (n = 18,777), and the most frequent disease-specific keyword is "breast neoplasms" (n = 1,943). Most data items are available in English (n = 545,749) and German (n = 109,267). Manually curated semantic annotations are available for 805,308 elements (554,352 items, 58,101 item groups, and 192,855 code list items), which were derived from 25,257 data models. In total, 1,609,225 Unified Medical Language System (UMLS) codes have been assigned, with 66,373 unique UMLS codes.
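
The counts in the Results can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the three ratios are derived here for orientation rather than reported by the authors.

```python
# Figures quoted in the Results paragraph of the abstract.
items = 554_352
item_groups = 58_101
code_list_items = 192_855
annotated_elements = 805_308
data_models = 25_257
assigned_codes = 1_609_225
unique_codes = 66_373

# The three element types should add up to the reported total.
total = items + item_groups + code_list_items
assert total == annotated_elements

# Derived ratios (not stated in the abstract; computed from the figures above).
codes_per_element = assigned_codes / annotated_elements
elements_per_model = annotated_elements / data_models
assignments_per_unique_code = assigned_codes / unique_codes

print(f"UMLS codes per annotated element: {codes_per_element:.2f}")
print(f"Annotated elements per data model: {elements_per_model:.2f}")
print(f"Assignments per unique UMLS code: {assignments_per_unique_code:.2f}")
```

The element breakdown is internally consistent, and on average each annotated element carries roughly two UMLS codes. The language counts are left out of the ratios on purpose, because the abstract does not say whether an item can be counted under more than one language.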

Conclusion: To our knowledge, the MDM Portal constitutes Europe's largest collection of medical data models with semantically annotated elements. As such, it can be used to increase compatibility of medical datasets and can be utilized as a large expert-annotated medical text corpus for natural language processing.
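
One way shared semantic annotations can increase compatibility is by pairing items from different data models that carry the same codes, regardless of label wording or language. The following is a minimal sketch under stated assumptions: the two forms, their item labels, the placeholder codes such as `CUI-0001` (not real UMLS codes), and the function `match_items` are all hypothetical and do not represent the MDM Portal's data format or API.

```python
from collections import defaultdict


def match_items(model_a, model_b):
    """Pair items from two models whose annotation code sets are identical."""
    # Index the second model by its (order-independent) set of codes.
    by_codes = defaultdict(list)
    for name, codes in model_b.items():
        by_codes[frozenset(codes)].append(name)

    # Look up every item of the first model in that index.
    matches = []
    for name, codes in model_a.items():
        for other in by_codes.get(frozenset(codes), []):
            matches.append((name, other))
    return matches


# Hypothetical forms: item label -> set of placeholder annotation codes.
form_a = {
    "Date of birth": {"CUI-0001"},
    "Body weight": {"CUI-0002"},
    "Smoking status": {"CUI-0003"},
}
form_b = {
    "Geburtsdatum": {"CUI-0001"},
    "Gewicht": {"CUI-0002"},
    "Blood pressure": {"CUI-0004"},
}

matches = match_items(form_a, form_b)
print(matches)
```

Here the English and German labels are paired purely through their shared codes, while items annotated in only one form stay unmatched. Exact set equality is a deliberately strict criterion; a real harmonization workflow would likely need partial matches and expert review as well.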

Source journal
Methods of Information in Medicine (Medicine / Computer Science: Information Systems)
CiteScore: 3.70
Self-citation rate: 11.80%
Publication volume: 33
Review time: 6-12 weeks
Journal description: Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of a journal's issue.