A Scoping Review of Adopted Information Extraction Methods for RCTs.

Q2 Medicine
Medical Journal of the Islamic Republic of Iran Pub Date : 2023-09-04 eCollection Date: 2023-01-01 DOI:10.47176/mjiri.37.95
Azadeh Aletaha, Leila Nemati-Anaraki, AbbasAli Keshtkar, Shahram Sedghi, Abdalsamad Keramatfar, Anna Korolyova
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10657257/pdf/
Citations: 0

Abstract

Background: Randomized controlled trials (RCTs) provide the strongest evidence for therapeutic interventions and their effects on groups of subjects. However, the large amount of unstructured information in these trials makes it challenging and time-consuming to make decisions and identify important concepts and valid evidence. This study aims to explore methods for automating or semi-automating information extraction from reports of RCT studies.

Methods: We conducted a systematic search of PubMed, ACM Digital Library, and Web of Science to identify relevant articles published between January 1, 2010, and 2022. We focused on published Natural Language Processing (NLP), machine learning, and deep learning methods that automate or semi-automate key elements of information extraction in the context of RCTs.

Results: A total of 26 publications were included, which discussed the automatic extraction of key characteristics of RCTs using PICO-style frameworks, including PIBOSO and PECODR. Among these publications, 14 (53.8%) extracted key characteristics based on PICO, PIBOSO, and PECODR, while 12 (46.2%) discussed information extraction methods in RCT studies. Common approaches included word/phrase matching; machine learning algorithms, such as binary classification with the Naïve Bayes algorithm, support vector machines for data classification, and conditional random fields; BERT networks for feature extraction; non-machine-dependent automation; and other machine learning or deep learning approaches.
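
To make the simplest of these approaches concrete, word/phrase matching can be sketched as follows. This is only an illustration under stated assumptions: the cue phrases, the `PICO_CUES` table, and the `tag_sentence` helper are hypothetical and are not taken from any of the reviewed systems.

```python
import re

# Hypothetical cue phrases for each PICO element (illustrative only;
# not drawn from the reviewed publications).
PICO_CUES = {
    "P": ["patients with", "participants", "adults aged"],
    "I": ["randomized to", "received", "treated with"],
    "C": ["placebo", "compared with", "control group"],
    "O": ["primary outcome", "mortality", "was measured"],
}


def tag_sentence(sentence):
    """Return the PICO labels (in P, I, C, O order) whose cue phrases occur."""
    text = sentence.lower()
    labels = []
    for label, cues in PICO_CUES.items():
        # Match each cue as a whole phrase, bounded by word boundaries.
        if any(re.search(r"\b" + re.escape(cue) + r"\b", text) for cue in cues):
            labels.append(label)
    return labels


if __name__ == "__main__":
    example = "Patients with type 2 diabetes were randomized to metformin or placebo."
    print(tag_sentence(example))
```

A matcher like this is transparent and needs no training data, but it only recognizes the phrases it was given, which is one plausible reason the reviewed literature also turns to statistical and deep learning methods.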

Conclusion: The lack of publicly available software and limited access to existing software make it difficult to determine the most powerful information extraction system. However, deep learning models such as Transformers and BERT language models have shown better performance in natural language processing.

Source journal: CiteScore 2.40; self-citation rate 0.00%; articles published 90; review time 8 weeks