A comparison of machine learning methods to find clinical trials for inclusion in new systematic reviews from their PROSPERO registrations prior to searching and screening

IF 6.1 · Zone 2 (Biology) · Q1 · MATHEMATICAL & COMPUTATIONAL BIOLOGY

Research Synthesis Methods · Pub Date: 2023-09-25 · DOI: 10.1002/jrsm.1672

Shifeng Liu, Florence T. Bourgeois, Claire Narang, Adam G. Dunn

Research Synthesis Methods, Volume 15, Issue 1, pp. 73-85. Full text: https://onlinelibrary.wiley.com/doi/10.1002/jrsm.1672


Abstract

Searching for trials is a key task in systematic reviews and a focus of automation. Previous approaches required knowing examples of relevant trials in advance, and most methods are focused on published trial articles. To complement existing tools, we compared methods for finding relevant trial registrations given an International Prospective Register of Systematic Reviews (PROSPERO) entry and where no relevant trials have been screened for inclusion in advance. We compared SciBERT-based (extension of Bidirectional Encoder Representations from Transformers) PICO extraction, MetaMap, and term-based representations using an imperfect dataset mined from 3632 PROSPERO entries connected to a subset of 65,662 trial registrations and 65,834 trial articles known to be included in systematic reviews. Performance was measured by the median rank and recall by rank of trials that were eventually included in the published systematic reviews. When ranking trial registrations relative to PROSPERO entries, 296 trial registrations needed to be screened to identify half of the relevant trials, and the best-performing approach used a basic term-based representation. When ranking trial articles relative to PROSPERO entries, 162 trial articles needed to be screened to identify half of the relevant trials, and the best-performing approach used a term-based representation. The results show that MetaMap and term-based representations outperformed approaches that included PICO extraction for this use case. The results suggest that when starting with a PROSPERO entry and where no trials have been screened for inclusion, automated methods can reduce workload, but additional processes are still needed to efficiently identify trial registrations or trial articles that meet the inclusion criteria of a systematic review.
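To make the evaluation setup concrete, the following is a minimal sketch of ranking candidate trials against a review registration with a term-based representation, then scoring the ranking by median rank and recall at a cutoff. It is not the authors' implementation: the abstract does not specify the weighting scheme, so TF-IDF with cosine similarity is an assumption, and the function names (`rank_trials`, `median_rank`, `recall_at`) and the toy review and trial texts are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import median


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed choice) so terms found in every document keep a small weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_trials(review_text, trials):
    """Return trial ids ordered from most to least similar to the review text."""
    ids = list(trials)
    vectors = tfidf_vectors([tokenize(review_text)] + [tokenize(trials[i]) for i in ids])
    query, trial_vecs = vectors[0], vectors[1:]
    scores = {i: cosine(query, v) for i, v in zip(ids, trial_vecs)}
    return sorted(ids, key=lambda i: scores[i], reverse=True)


def median_rank(ranking, relevant):
    """Median 1-based position of the relevant trials in the ranking."""
    return median(ranking.index(i) + 1 for i in relevant)


def recall_at(ranking, relevant, k):
    """Fraction of relevant trials found within the top k positions."""
    return len(set(ranking[:k]) & set(relevant)) / len(relevant)


# Hypothetical toy data standing in for one PROSPERO entry and its candidate trials.
review = "Statins for prevention of cardiovascular events in adults with diabetes"
trials = {
    "T1": "Atorvastatin versus placebo for cardiovascular events in adults with type 2 diabetes",
    "T2": "Cognitive behavioural therapy for insomnia in older adults",
    "T3": "Statins and cardiovascular prevention in diabetes: a randomised trial",
    "T4": "Vitamin D supplementation for fracture prevention",
}
relevant = {"T1", "T3"}  # trials assumed to be included in the published review

ranking = rank_trials(review, trials)
print(ranking)
print("median rank:", median_rank(ranking, relevant))
print("recall@2:", recall_at(ranking, relevant, 2))
```

Read this way, the abstract's headline figures (296 registrations or 162 articles screened to find half of the relevant trials) are statements about how far down such a ranking a reviewer must go, aggregated over many review entries rather than the single toy entry shown here.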


Source journal

Research Synthesis Methods (MATHEMATICAL & COMPUTATIONAL BIOLOGY; MULTIDISCIPLINARY SCIENCES)

CiteScore: 16.90

Self-citation rate: 3.10%

About the journal: Research Synthesis Methods is a reputable, peer-reviewed journal that focuses on the development and dissemination of methods for conducting systematic research synthesis. Our aim is to advance the knowledge and application of research synthesis methods across various disciplines. Our journal provides a platform for the exchange of ideas and knowledge related to designing, conducting, analyzing, interpreting, reporting, and applying research synthesis. While research synthesis is commonly practiced in the health and social sciences, our journal also welcomes contributions from other fields to enrich the methodologies employed in research synthesis across scientific disciplines. By bridging different disciplines, we aim to foster collaboration and cross-fertilization of ideas, ultimately enhancing the quality and effectiveness of research synthesis methods. Whether you are a researcher, practitioner, or stakeholder involved in research synthesis, our journal strives to offer valuable insights and practical guidance for your work.