An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text (Short Paper)

Joshua Valdez, Michael Rueschman, Matthew Kim, Susan Redline, Satya S Sahoo
DOI: 10.1007/978-3-319-48472-3_43 (https://doi.org/10.1007/978-3-319-48472-3_43)
Published in: On the move to meaningful Internet systems ... : CoopIS, DOA, and ODBASE : Confederated International Conferences, CoopIS, DOA, and ODBASE ... proceedings. OTM Confederated International Conferences
Publication date: 2016-10-01 (Epub 2016-10-18)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5486409/pdf/nihms861936.pdf
Citations: 0

Abstract


Extracting structured information from biomedical literature is a challenging problem because of the complexity of the biomedical domain and the lack of appropriate natural language processing (NLP) techniques. High-quality domain ontologies model both data and metadata at a fine level of granularity, and can therefore be used to extract structured information from biomedical text accurately. Extracting provenance metadata, which describes the history or source of information, from published articles is an important task in support of scientific reproducibility. The reproducibility of results reported by previous research studies is a foundational component of scientific advancement, as highlighted by the recent US National Institutes of Health initiative called "Principles of Rigor and Reproducibility". In this paper, we describe an effective approach to extracting provenance metadata from published biomedical research literature using an ontology-enabled NLP platform developed as part of the Provenance for Clinical and Healthcare Research (ProvCaRe) project. The ProvCaRe-NLP tool extends the clinical Text Analysis and Knowledge Extraction System (cTAKES) platform using both provenance and biomedical domain ontologies. We demonstrate the effectiveness of the ProvCaRe-NLP tool using a corpus of 20 peer-reviewed publications. Our evaluation shows that the ProvCaRe-NLP tool has significantly higher recall in extracting provenance metadata than existing NLP pipelines such as MetaMap.
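
To make the two ideas in the abstract concrete (ontology-driven term extraction, and recall as the evaluation measure), here is a minimal Python sketch. It is not the ProvCaRe-NLP tool and does not use cTAKES or MetaMap; the toy ontology, its class labels, the example sentence, and the gold annotations are all hypothetical and chosen only for illustration. Recall is computed in the standard way, as the fraction of gold-standard annotations that the extractor recovers.

```python
import re

# Hypothetical mini "ontology": surface term -> provenance class label.
# These terms and labels are illustrative, not taken from the ProvCaRe ontology.
TOY_ONTOLOGY = {
    "polysomnography": "Instrument",
    "randomized controlled trial": "StudyMethod",
    "body mass index": "Variable",
}


def extract_terms(text, ontology):
    """Return the set of (term, label) pairs whose term occurs in the text."""
    lowered = text.lower()
    found = set()
    for term, label in ontology.items():
        # Word-boundary match so a term is not matched inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.add((term, label))
    return found


def recall(extracted, gold):
    """Recall = |extracted & gold| / |gold|; defined as 0.0 for an empty gold set."""
    if not gold:
        return 0.0
    return len(extracted & gold) / len(gold)


# Hypothetical sentence and hand-made gold annotations.
sentence = (
    "Participants in the randomized controlled trial underwent "
    "polysomnography at baseline."
)
gold = {
    ("randomized controlled trial", "StudyMethod"),
    ("polysomnography", "Instrument"),
    ("baseline", "TimePoint"),  # not in the toy ontology, so it is missed
}

extracted = extract_terms(sentence, TOY_ONTOLOGY)
print(sorted(extracted))
print(recall(extracted, gold))
```

In this toy setup the extractor finds two of the three gold annotations; the third is missed because its term is absent from the ontology, which illustrates why ontology coverage directly affects recall.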
