Towards automating systematic reviews on immunization using an advanced natural language processing-based extraction system.

David Begert, Justin Granek, Brian Irwin, Chris Brogly
Journal: Canada communicable disease report = Releve des maladies transmissibles au Canada, volume 46, issue 6, pages 174-179
Publication date: 2020-06-04
Publication type: Journal Article
DOI: 10.14745/ccdr.v46i06a04 (https://doi.org/10.14745/ccdr.v46i06a04)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11182649/pdf/
Citations: 0

Abstract

Evidence-informed decision making is based on the premise that the entirety of information on a topic is collected and analyzed. Systematic reviews allow for data from different studies to be rigorously assessed according to PICO principles (population, intervention, control, outcomes). However, conducting a systematic review is generally a slow process that is a significant drain on resources. The fundamental problem is that the current approach to creating a systematic review cannot scale to meet the challenges resulting from the massive body of unstructured evidence. For this reason, the Public Health Agency of Canada has been examining the automation of different stages of evidence synthesis to increase efficiencies. In this article, we present an overview of an initial version of a novel machine learning-based system that is powered by recent advances in natural language processing (NLP), such as BioBERT, with further optimizations completed using a new immunization-specific document database. The resulting optimized NLP model at the core of this system is able to identify and extract PICO-related fields from publications on immunization with an average accuracy of 88% across five classes of text. Functionality is provided through a straightforward web interface.
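
The abstract reports an average accuracy of 88% across five classes of text but does not say how that average was computed. As a rough illustration only, the sketch below shows one common reading: score each class separately, then take the unweighted mean of the per-class scores. The class names, the example labels, and the function names are all hypothetical and are not taken from the article.

```python
from collections import defaultdict


def per_class_accuracy(gold, predicted):
    """For each gold class, return the fraction of its items labelled correctly."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    total = defaultdict(int)
    correct = defaultdict(int)
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    return {label: correct[label] / total[label] for label in total}


def macro_average(scores):
    """Unweighted mean of the per-class scores (each class counts equally)."""
    return sum(scores.values()) / len(scores)


# Hypothetical labels for ten sentences; the article does not name its five classes.
gold = ["population", "population", "intervention", "intervention",
        "control", "control", "outcome", "outcome", "other", "other"]
predicted = ["population", "intervention", "intervention", "intervention",
             "control", "control", "outcome", "other", "other", "other"]

scores = per_class_accuracy(gold, predicted)
for label, value in scores.items():
    print(f"{label}: {value:.2f}")
print(f"average across classes: {macro_average(scores):.2f}")
```

An unweighted mean treats rare and frequent classes alike; an average weighted by class frequency could give a different figure on the same predictions, so the reported 88% should be read with that ambiguity in mind.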
