Concept Graphs: A Novel Approach for Textual Analysis of Medical Documents

JCR: Q3 (Health Professions)
Franz Matthies, Christoph Beger, Ralph Schäfermeier, Alexandr Uciteli
Journal: Studies in Health Technology and Informatics, vol. 307, pp. 172-179
Published: 2023-09-12
Publication type: Journal Article
DOI: 10.3233/SHTI230710 (https://doi.org/10.3233/SHTI230710)
Citations: 0

Abstract


Automatically analyzing the textual content of documents is challenging in general, and even more so in the medical domain. Here, we normally cannot rely on specifically pre-trained NLP models or, for data privacy reasons, on the (massive) amounts of training material needed to build such models. We therefore propose a method that uses general-purpose basic text analysis components and state-of-the-art transformer models to represent a corpus of documents as multiple graphs, in which important, conceptually related phrases from the documents constitute the nodes and their semantic relations form the edges. This method could serve as a basis for several explorative procedures and can draw on a plethora of publicly available resources. We test it by comparing the effectiveness of these so-called Concept Graphs with another recently suggested approach on a common information retrieval use case, document clustering.
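To make the graph representation more concrete, here is a minimal Python sketch of the general idea: phrases become nodes, and an edge links two phrases whose vectors are semantically close. It is an illustration under stated assumptions, not the authors' implementation. The toy vectors are hand-picked stand-ins for transformer embeddings, and the function names, the cosine-similarity threshold, and the use of connected components to obtain multiple graphs are all hypothetical choices that the abstract does not specify.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_concept_graphs(phrase_vectors, threshold=0.9):
    """Link similar phrases and split the result into connected components.

    Assumption: an edge exists when cosine similarity >= threshold.
    Returns (edges, graphs), where edges maps phrase pairs to similarity
    scores and graphs is a list of node sets, one per component.
    """
    phrases = list(phrase_vectors)
    adjacency = {p: set() for p in phrases}
    edges = {}
    for a, b in combinations(phrases, 2):
        sim = cosine(phrase_vectors[a], phrase_vectors[b])
        if sim >= threshold:
            edges[(a, b)] = sim
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Depth-first search: each connected component is one graph.
    seen, graphs = set(), []
    for start in phrases:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        graphs.append(component)
    return edges, graphs


# Hand-picked toy vectors standing in for transformer embeddings (assumption).
PHRASE_VECTORS = {
    "myocardial infarction": (0.9, 0.1, 0.0),
    "heart attack": (0.85, 0.15, 0.05),
    "chest pain": (0.7, 0.3, 0.1),
    "insulin therapy": (0.05, 0.1, 0.95),
    "blood glucose": (0.1, 0.2, 0.9),
}

edges, graphs = build_concept_graphs(PHRASE_VECTORS, threshold=0.9)
for graph in graphs:
    print(sorted(graph))
```

With these toy vectors the cardiac phrases end up in one graph and the diabetes-related phrases in another. In a real pipeline the vectors would come from an embedding model, and the threshold would need tuning on the corpus at hand.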

Source journal: Studies in Health Technology and Informatics
Subject area: Health Professions, Health Information Management
CiteScore: 1.20
Self-citation rate: 0.00%
Articles published: 1463
About the journal: This book series was started in 1990 to promote research conducted under the auspices of the EC programmes Advanced Informatics in Medicine (AIM) and Biomedical and Health Research (BHR), bioengineering branch. A driving aspect of international health informatics is that telecommunication technology, rehabilitative technology, intelligent home technology and many other components are moving together and forming one integrated world of information and communication media.