化学实体规范化促进阿尔茨海默病和痴呆症治疗药物的成功转化开发。

IF 1.6 3区 工程技术 Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY
Sarah Mullin, Robert McDougal, Kei-Hoi Cheung, Halil Kilicoglu, Amanda Beck, Caroline J Zeiss
{"title":"化学实体规范化促进阿尔茨海默病和痴呆症治疗药物的成功转化开发。","authors":"Sarah Mullin, Robert McDougal, Kei-Hoi Cheung, Halil Kilicoglu, Amanda Beck, Caroline J Zeiss","doi":"10.1186/s13326-024-00314-1","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Identifying chemical mentions within the Alzheimer's and dementia literature can provide a powerful tool to further therapeutic research. Leveraging the Chemical Entities of Biological Interest (ChEBI) ontology, which is rich in hierarchical and other relationship types, for entity normalization can provide an advantage for future downstream applications. We provide a reproducible hybrid approach that combines an ontology-enhanced PubMedBERT model for disambiguation with a dictionary-based method for candidate selection.</p><p><strong>Results: </strong>There were 56,553 chemical mentions in the titles of 44,812 unique PubMed article abstracts. Based on our gold standard, our method of disambiguation improved entity normalization by 25.3 percentage points compared to using only the dictionary-based approach with fuzzy-string matching for disambiguation. For the CRAFT corpus, our method outperformed baselines (maximum 78.4%) with a 91.17% accuracy. For our Alzheimer's and dementia cohort, we were able to add 47.1% more potential mappings between MeSH and ChEBI when compared to BioPortal.</p><p><strong>Conclusion: </strong>Use of natural language models like PubMedBERT and resources such as ChEBI and PubChem provide a beneficial way to link entity mentions to ontology terms, while further supporting downstream tasks like filtering ChEBI mentions based on roles and assertions to find beneficial therapies for Alzheimer's and dementia.</p>","PeriodicalId":15055,"journal":{"name":"Journal of Biomedical Semantics","volume":"15 1","pages":"13"},"PeriodicalIF":1.6000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11290083/pdf/","citationCount":"0","resultStr":"{\"title\":\"Chemical entity normalization for successful translational development of Alzheimer's disease and dementia therapeutics.\",\"authors\":\"Sarah Mullin, Robert McDougal, Kei-Hoi Cheung, Halil Kilicoglu, Amanda Beck, Caroline J Zeiss\",\"doi\":\"10.1186/s13326-024-00314-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Identifying chemical mentions within the Alzheimer's and dementia literature can provide a powerful tool to further therapeutic research. Leveraging the Chemical Entities of Biological Interest (ChEBI) ontology, which is rich in hierarchical and other relationship types, for entity normalization can provide an advantage for future downstream applications. We provide a reproducible hybrid approach that combines an ontology-enhanced PubMedBERT model for disambiguation with a dictionary-based method for candidate selection.</p><p><strong>Results: </strong>There were 56,553 chemical mentions in the titles of 44,812 unique PubMed article abstracts. Based on our gold standard, our method of disambiguation improved entity normalization by 25.3 percentage points compared to using only the dictionary-based approach with fuzzy-string matching for disambiguation. For the CRAFT corpus, our method outperformed baselines (maximum 78.4%) with a 91.17% accuracy. For our Alzheimer's and dementia cohort, we were able to add 47.1% more potential mappings between MeSH and ChEBI when compared to BioPortal.</p><p><strong>Conclusion: </strong>Use of natural language models like PubMedBERT and resources such as ChEBI and PubChem provide a beneficial way to link entity mentions to ontology terms, while further supporting downstream tasks like filtering ChEBI mentions based on roles and assertions to find beneficial therapies for Alzheimer's and dementia.</p>\",\"PeriodicalId\":15055,\"journal\":{\"name\":\"Journal of Biomedical Semantics\",\"volume\":\"15 1\",\"pages\":\"13\"},\"PeriodicalIF\":1.6000,\"publicationDate\":\"2024-07-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11290083/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Biomedical Semantics\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1186/s13326-024-00314-1\",\"RegionNum\":3,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"MATHEMATICAL & COMPUTATIONAL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Biomedical Semantics","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1186/s13326-024-00314-1","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

背景:识别阿尔茨海默氏症和痴呆症文献中提到的化学物质可以为进一步的治疗研究提供强有力的工具。生物兴趣化学实体(ChEBI)本体具有丰富的层次关系和其他关系类型,利用该本体进行实体规范化可为未来的下游应用提供优势。我们提供了一种可重复的混合方法,它将本体增强的 PubMedBERT 模型与基于词典的候选选择方法结合起来进行消歧:结果:44,812 篇 PubMed 文章摘要的标题中提到了 56,553 种化学物质。根据我们的黄金标准,与仅使用基于词典的模糊字符串匹配方法进行消歧相比,我们的消歧方法将实体规范化提高了 25.3 个百分点。在 CRAFT 语料库中,我们的方法以 91.17% 的准确率超过了基线(最高 78.4%)。对于我们的阿尔茨海默氏症和痴呆症队列,与 BioPortal 相比,我们能够在 MeSH 和 ChEBI 之间增加 47.1% 的潜在映射:结论:使用 PubMedBERT 等自然语言模型以及 ChEBI 和 PubChem 等资源,可以有效地将实体提及链接到本体术语,同时进一步支持下游任务,如根据角色和断言过滤 ChEBI 提及,从而找到治疗阿尔茨海默氏症和痴呆症的有效疗法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Chemical entity normalization for successful translational development of Alzheimer's disease and dementia therapeutics.

Background: Identifying chemical mentions within the Alzheimer's and dementia literature can provide a powerful tool to further therapeutic research. Leveraging the Chemical Entities of Biological Interest (ChEBI) ontology, which is rich in hierarchical and other relationship types, for entity normalization can provide an advantage for future downstream applications. We provide a reproducible hybrid approach that combines an ontology-enhanced PubMedBERT model for disambiguation with a dictionary-based method for candidate selection.

Results: There were 56,553 chemical mentions in the titles of 44,812 unique PubMed article abstracts. Based on our gold standard, our method of disambiguation improved entity normalization by 25.3 percentage points compared to using only the dictionary-based approach with fuzzy-string matching for disambiguation. For the CRAFT corpus, our method outperformed baselines (maximum 78.4%) with a 91.17% accuracy. For our Alzheimer's and dementia cohort, we were able to add 47.1% more potential mappings between MeSH and ChEBI when compared to BioPortal.

Conclusion: Use of natural language models like PubMedBERT and resources such as ChEBI and PubChem provide a beneficial way to link entity mentions to ontology terms, while further supporting downstream tasks like filtering ChEBI mentions based on roles and assertions to find beneficial therapies for Alzheimer's and dementia.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Journal of Biomedical Semantics
Journal of Biomedical Semantics MATHEMATICAL & COMPUTATIONAL BIOLOGY-
CiteScore
4.20
自引率
5.30%
发文量
28
审稿时长
30 weeks
期刊介绍: Journal of Biomedical Semantics addresses issues of semantic enrichment and semantic processing in the biomedical domain. The scope of the journal covers two main areas: Infrastructure for biomedical semantics: focusing on semantic resources and repositories, meta-data management and resource description, knowledge representation and semantic frameworks, the Biomedical Semantic Web, and semantic interoperability. Semantic mining, annotation, and analysis: focusing on approaches and applications of semantic resources; and tools for investigation, reasoning, prediction, and discoveries in biomedicine.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信