Artificial Intelligence Search Tools for Evidence Synthesis: Comparative Analysis and Implementation Recommendations
Robin Featherstone, Melissa Walter, Danielle MacDougall, Eric Morenz, Sharon Bailey, Robyn Butcher, Caitlyn Ford, Hannah Loshak, David Kaunelis
Cochrane Evidence Synthesis and Methods, 3(5), published 2025-09-08. DOI: 10.1002/cesm.70045. https://onlinelibrary.wiley.com/doi/10.1002/cesm.70045
Abstract
To inform implementation recommendations for novel or emerging technologies, Research Information Services at Canada's Drug Agency conducted a multimodal research project involving a literature review, a retrospective comparative analysis, and a focus group examining three artificial intelligence (AI) or automation tools for information retrieval (AI search tools): Lens.org, SpiderCite, and Microsoft Copilot. For the comparative analysis, the customary information retrieval practices used at Canada's Drug Agency served as the reference standard, and the eligible studies from seven completed projects were used to measure tool performance. For searches conducted with our usual practice approaches and with each of the three tools, we calculated sensitivity/recall, number needed to read (NNR), time to search and screen, unique contributions, and the likely impact of those unique contributions on the projects' findings. Our investigation confirmed that the performance of AI search tools is inconsistent and variable across the range of information retrieval tasks performed at Canada's Drug Agency. Implementation recommendations from this study informed a "fit for purpose" approach in which Information Specialists leverage AI search tools for specific tasks or project types.
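As a minimal illustrative sketch (not the authors' actual analysis code), the two headline metrics can be computed from screening counts, assuming the standard definitions: sensitivity/recall is the proportion of known eligible studies a search retrieved, and NNR is the number of records screened per eligible study found (the inverse of precision). The figures in the usage example below are hypothetical.

```python
def sensitivity(relevant_retrieved: int, total_relevant: int) -> float:
    """Proportion of known eligible studies that the search retrieved (recall)."""
    return relevant_retrieved / total_relevant


def number_needed_to_read(total_screened: int, relevant_retrieved: int) -> float:
    """Records screened per eligible study found (inverse of precision)."""
    return total_screened / relevant_retrieved


# Hypothetical example: a tool retrieves 450 records, 18 of which are eligible,
# against 20 eligible studies identified by the reference-standard search.
print(sensitivity(18, 20))             # 0.9
print(number_needed_to_read(450, 18))  # 25.0
```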