A resource description framework (RDF) model of named entity co-occurrences in biomedical literature and its integration with PubChemRDF

IF 5.7 · Zone 2 (Chemistry) · Q1 CHEMISTRY, MULTIDISCIPLINARY
Qingliang Li, Sunghwan Kim, Leonid Zaslavsky, Tiejun Cheng, Bo Yu, Evan E. Bolton
Journal of Cheminformatics, volume 17 (1). Published 2025-05-21. Journal Article.
DOI: 10.1186/s13321-025-01017-0
Article: https://link.springer.com/article/10.1186/s13321-025-01017-0
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01017-0
Citations: 0

Abstract


Named entities, such as chemicals/drugs, genes/proteins, and diseases, and their associations are not only important components of biomedical literature, but also the foundation of creating biomedical knowledgebases and knowledge graphs. This work addresses the challenges of expressing co-occurrence associations between named entities extracted from a biomedical literature corpus in a machine-readable format. We developed a Resource Description Framework (RDF) data model and integrated it into the PubChemRDF resource, which is freely accessible and publicly available. The developed co-occurrence data model was populated into a triplestore with named entities and their associations derived from text mining of millions of biomedical references found in PubMed. The utility of the data model was demonstrated through multiple use cases. Together with meta-data modeling of the references including the information about the author, journal, grant, and funding agency, this data model allows researchers to address pertinent biomedical questions through SPARQL queries and helps to exploit biomedical knowledge in various user perspectives and use cases.
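As a rough illustration of the idea in the abstract, the sketch below stores co-occurrence associations as subject-predicate-object triples and answers a simple question by pattern matching, loosely in the spirit of a SPARQL basic graph pattern. It is not the PubChemRDF schema: the `ex:` prefix, the predicate names, the entity identifiers, and the references are all hypothetical placeholders, and a plain Python list stands in for a triplestore.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Hypothetical toy data (not real PubChemRDF content). Each co-occurrence is
# modelled as its own node that links two named entities and a reference.
TRIPLES: list[Triple] = [
    ("ex:cooc1", "ex:subject", "ex:chemical/C1"),
    ("ex:cooc1", "ex:object", "ex:disease/D1"),
    ("ex:cooc1", "ex:reference", "ex:ref/R1"),
    ("ex:cooc2", "ex:subject", "ex:chemical/C1"),
    ("ex:cooc2", "ex:object", "ex:gene/G1"),
    ("ex:cooc2", "ex:reference", "ex:ref/R2"),
    ("ex:cooc3", "ex:subject", "ex:chemical/C2"),
    ("ex:cooc3", "ex:object", "ex:disease/D1"),
    ("ex:cooc3", "ex:reference", "ex:ref/R1"),
]


def match(
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given terms (None = wildcard)."""
    for triple in TRIPLES:
        if s is not None and triple[0] != s:
            continue
        if p is not None and triple[1] != p:
            continue
        if o is not None and triple[2] != o:
            continue
        yield triple


def cooccurring_with(entity: str) -> list[str]:
    """Return the sorted entities recorded as co-occurring with `entity`."""
    partners: set[str] = set()
    # Step 1: find co-occurrence nodes whose subject is the entity.
    for node, _, _ in match(p="ex:subject", o=entity):
        # Step 2: follow each node to the entity on the other side.
        for _, _, partner in match(s=node, p="ex:object"):
            partners.add(partner)
    return sorted(partners)


if __name__ == "__main__":
    print(cooccurring_with("ex:chemical/C1"))
```

Giving each association its own node is one common way to attach further facts, such as the supporting reference, to the association itself; whether the published model takes the same approach should be checked against the paper.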

Source journal
Journal of Cheminformatics
Categories: CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.