Cochrane Evidence Synthesis and Methods: Latest Articles

Leveraging AI for Meta-Analysis: Evaluating LLMs in Detecting Publication Bias for Next-Generation Evidence Synthesis
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-09-18, DOI: 10.1002/cesm.70047
Xing Xing, Lifeng Lin, Mohammad Hassan Murad, Jiayi Tong
Introduction: Publication bias (PB) threatens the validity of meta-analyses by distorting effect size estimates, potentially leading to misleading conclusions. With advanced pattern recognition and multimodal capabilities, large language models (LLMs) may be able to evaluate PB and make the systematic review process more efficient.

Methods: We evaluated the ability of two state-of-the-art multimodal LLMs, GPT-4o and Llama 3.2 Vision, to detect PB using funnel plots alone and in combination with quantitative inputs. We simulated meta-analyses under varying conditions: absence of PB, different severities of PB, varying numbers of studies per meta-analysis, and differing degrees of between-study heterogeneity.

Results: Neither GPT-4o nor Llama 3.2 Vision consistently detected PB across settings. Under no-publication-bias conditions, GPT-4o achieved higher specificity than Llama 3.2 Vision, with the difference most pronounced in meta-analyses with 20 or more studies. Including quantitative inputs alongside funnel plots did not significantly improve performance, and between-study heterogeneity and patterns of non-reported studies had minimal impact on the models' assessments.

Conclusions: The ability of LLMs to detect PB without fine-tuning is limited at present. This study highlights the need for specialized model adaptation before LLMs can be effectively integrated into meta-analysis workflows. Future research can focus on targeted refinements to enhance LLM performance and utility in evidence synthesis.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70047
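The standard statistical counterpart to visual funnel-plot inspection is Egger's regression, which regresses each study's standardized effect on its precision; a nonzero intercept signals small-study effects. The paper does not say which quantitative inputs it supplied to the models, so the following is only an illustrative pure-Python sketch of the test itself:

```python
# Egger's regression test for funnel-plot asymmetry (illustrative sketch,
# not the paper's pipeline). Regress standardized effect (effect/se) on
# precision (1/se): the intercept estimates small-study bias and the
# slope estimates the underlying effect.

def egger_intercept(effects, ses):
    """Return (intercept, slope) of the Egger regression via simple OLS."""
    xs = [1.0 / se for se in ses]                       # precision
    ys = [eff / se for eff, se in zip(effects, ses)]    # standardized effect
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Studies sharing one true effect and no bias lie on a line through the
# origin, so the intercept is numerically zero and the slope is the effect.
b0, b1 = egger_intercept([0.5, 0.5, 0.5, 0.5], [0.1, 0.2, 0.3, 0.4])
```

In a real analysis the intercept would be tested against zero with a t-test; the sketch keeps only the point estimates.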
Citations: 0
Retiring the Term “Weighted Mean Difference” in Contemporary Evidence Synthesis
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-09-11, DOI: 10.1002/cesm.70051
Lifeng Lin, Xing Xing, Wenshan Han, Jiayi Tong
Evidence synthesis frequently involves quantitative analyses of continuous outcomes. A cross-sectional study of Cochrane systematic reviews found that 6672 of 22,453 meta-analyses (29.7%) involved continuous outcomes [1]. The primary effect measures for continuous outcomes are the mean difference (MD) and standardized mean difference (SMD) [2]. The MD is appropriate when all included studies measure outcomes on identical scales (e.g., body weight in kilograms); the SMD serves as a solution when studies use different measurement scales (e.g., varied questionnaire scoring methods). Although alternative measures (e.g., the ratio of means) exist [3], they remain relatively infrequent in practice.

Despite this conceptual clarity, the term “weighted mean difference” (WMD) appears frequently in the systematic review literature [4], which can lead to confusion about its relationship to the MD. In this article, we first clarify the distinction between MD and WMD, then describe the historical factors underlying the term's adoption and persistence, discuss why contemporary methods render it unnecessary, illustrate examples of misuse, and conclude with practical recommendations for clearer reporting.

The MD is the straightforward difference between group means (e.g., intervention vs. control) for a continuous outcome. Although the true MD is an unknown population-level quantity, practical research relies on sample estimates from individual studies. Meta-analysis systematically synthesizes these study-level MD estimates to derive an overall summary effect across studies.

The term WMD emerged historically to emphasize the weighted averaging process of meta-analysis, wherein each study contributes a sample MD weighted by its statistical precision (i.e., inverse variance) [5]. Typically, larger studies with smaller variances, or narrower confidence intervals, receive greater weights. Traditional meta-analytic methods, whether fixed-effect (also known as common-effect) or random-effects, follow this inverse-variance weighting principle: under fixed-effect models, study weights directly reflect the inverse of their variances, whereas random-effects models incorporate both within-study and between-study variances.

To contextualize the widespread adoption of WMD, we conducted a brief literature search using Google Scholar on June 12, 2025. Using exact-phrase queries in quotation marks, for each calendar year from 1990 to 2024 we recorded the counts for “weighted mean difference” AND “systematic review” and, separately, for “systematic review,” then calculated the yearly proportion (Figure 1). Google Scholar indexes titles, abstracts, and, when available, full texts, so counts reflect occurrences anywhere in the indexed record and are approximate.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70051
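The inverse-variance weighting that the term WMD alludes to takes only a few lines to state. A minimal fixed-effect sketch (illustrative only; real analyses use established packages such as R's metafor or RevMan):

```python
import math

def pool_fixed_effect(mds, ses):
    """Fixed-effect inverse-variance pooling of study-level mean differences.

    Each study's MD is weighted by 1/variance, so larger, more precise
    studies contribute more to the summary estimate. Returns the pooled
    MD and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * md for w, md in zip(weights, mds)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Two equally precise studies: the pooled MD is their plain average.
md, se = pool_fixed_effect([1.0, 2.0], [1.0, 1.0])
```

The "weighted" step is thus a property of the meta-analytic machinery, not of the effect measure itself, which is the article's central point.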
Citations: 0
Using a Large Language Model (ChatGPT-4o) to Assess the Risk of Bias in Randomized Controlled Trials of Medical Interventions: Interrater Agreement With Human Reviewers
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-09-10, DOI: 10.1002/cesm.70048
Christopher James Rose, Julia Bidonde, Martin Ringsten, Julie Glanville, Thomas Potrebny, Chris Cooper, Ashley Elizabeth Muller, Hans Bugge Bergsund, Jose F. Meneses-Echavez, Rigmor C. Berg
Background: Risk of bias (RoB) assessment is a highly skilled task that is time-consuming and subject to human error. Previous RoB automation tools used machine learning models built on relatively small task-specific training sets. Large language models (LLMs; e.g., ChatGPT) are complex models built on non-task-specific Internet-scale training sets; they demonstrate human-like abilities and might be able to support tasks like RoB assessment.

Methods: Following a published peer-reviewed protocol, we randomly sampled 100 Cochrane reviews. New or updated reviews that evaluated medical interventions, included at least one eligible trial, and presented human consensus assessments using Cochrane RoB1 or RoB2 were eligible. We excluded reviews performed under emergency conditions (e.g., COVID-19) and those on public health or welfare. We randomly sampled one trial from each review; trials using individual- or cluster-randomized designs were eligible. We extracted human consensus RoB assessments of the trials from the reviews, and methods texts from the trials. We used 25 review-trial pairs to develop a ChatGPT prompt for assessing RoB from trial methods text, then used the prompt and the remaining 75 review-trial pairs to estimate human-ChatGPT agreement for “Overall RoB” (primary outcome) and “RoB due to the randomization process,” and ChatGPT-ChatGPT (intrarater) agreement for “Overall RoB.” We used ChatGPT-4o (February 2025) throughout.

Results: The 75 reviews were sampled from 35 Cochrane review groups, and all used RoB1. The 75 trials spanned five decades, and all but one were published in English. Human-ChatGPT agreement for “Overall RoB” was 50.7% (95% CI 39.3%–62.0%), substantially higher than expected by chance (p = 0.0015). Human-ChatGPT agreement for “RoB due to the randomization process” was 78.7% (95% CI 69.4%–88.0%; p < 0.001). ChatGPT-ChatGPT agreement was 74.7% (95% CI 64.8%–84.6%; p < 0.001).

Conclusions: ChatGPT appears to have some ability to assess RoB and is unlikely to be guessing or “hallucinating”. The estimated agreement for “Overall RoB” is well above agreement estimates reported for some human reviewers, but below the highest estimates. LLM-based systems for assessing RoB may help streamline and improve evidence synthesis production.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70048
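A "better than chance" claim of this kind can be checked with an exact one-sided binomial test. The sketch below assumes a naive null of uniform guessing over the three RoB1 overall categories (low/unclear/high), i.e., chance agreement of 1/3; this simplifying null is our assumption, not necessarily the paper's model:

```python
from math import comb

def binom_sf(k, n, p):
    """Exact one-sided tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

# 38 of 75 trials agreed (about 50.7%). Under uniform guessing over three
# categories the chance rate is 1/3, i.e., 25 of 75 expected agreements.
p_value = binom_sf(38, 75, 1 / 3)
```

Observed agreement at roughly the chance rate (e.g., 25 of 75) would give a large tail probability, while 38 of 75 gives a small one, consistent with the paper's conclusion that the model is unlikely to be guessing.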
Citations: 0
Artificial Intelligence Search Tools for Evidence Synthesis: Comparative Analysis and Implementation Recommendations
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-09-08, DOI: 10.1002/cesm.70045
Robin Featherstone, Melissa Walter, Danielle MacDougall, Eric Morenz, Sharon Bailey, Robyn Butcher, Caitlyn Ford, Hannah Loshak, David Kaunelis
To inform implementation recommendations for novel or emerging technologies, Research Information Services at Canada's Drug Agency conducted a multimodal research project, comprising a literature review, a retrospective comparative analysis, and a focus group, on three artificial intelligence (AI) or automation tools for information retrieval (AI search tools): Lens.org, SpiderCite, and Microsoft Copilot. For the comparative analysis, the customary information retrieval practices at Canada's Drug Agency served as the reference standard, and the eligible studies of seven completed projects were used to measure tool performance. For searches conducted with the usual-practice approaches and with each of the three tools, we calculated sensitivity/recall, number needed to read (NNR), time to search and screen, unique contributions, and the likely impact of the unique contributions on the projects' findings. Our investigation confirmed that AI search tools perform inconsistently and variably across the range of information retrieval tasks performed at Canada's Drug Agency. Implementation recommendations from this study informed a “fit for purpose” approach in which Information Specialists leverage AI search tools for specific tasks or project types.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70045
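The two headline metrics are simple ratios over screening counts: recall is the share of all known relevant studies the search retrieved, and NNR is records screened per relevant study found (the reciprocal of precision). A sketch with hypothetical numbers, not the agency's data:

```python
def search_metrics(records_screened, relevant_retrieved, relevant_total):
    """Recall (sensitivity) and number needed to read (NNR) for a search.

    recall = relevant retrieved / all known relevant studies
    nnr    = records screened / relevant retrieved (1 / precision)
    """
    recall = relevant_retrieved / relevant_total
    nnr = records_screened / relevant_retrieved
    return recall, nnr

# Hypothetical search: 400 records screened, 8 of 10 known relevant found.
recall, nnr = search_metrics(400, 8, 10)
```

A high-recall, high-NNR tool finds most relevant studies but at a heavy screening cost, which is exactly the trade-off a "fit for purpose" policy has to weigh per task.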
Citations: 0
Exploring the Role of Artificial Intelligence in Evidence Synthesis: Insights From the CORE Information Retrieval Forum 2025
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-09-07, DOI: 10.1002/cesm.70049
Claire H. Eastaugh, Madeleine Still, Fiona R. Beyer, Sheila A. Wallace, Hannah O'Keefe
Introduction: Information retrieval is essential for evidence synthesis, but developing search strategies can be labor-intensive and time-consuming. Automating these processes would be of benefit and interest, though it is unclear whether Information Specialists (IS) are willing to adopt artificial intelligence (AI) methodologies or how they currently use them. In January 2025, the NIHR Innovation Observatory and the NIHR Methodology Incubator for Applied Health and Care Research co-sponsored the inaugural CORE Information Retrieval Forum, where attendees discussed AI's role in information retrieval.

Methods: The Forum hosted a Knowledge Café. Participation was voluntary, and attendees could choose one of six event-themed discussion tables, including AI. To support each discussion, a QR code linking to a virtual collaboration tool (Padlet; padlet.com) and a poster in the exhibition space were available throughout the day for attendee contributions.

Results: The Forum was attended by 131 IS from nine types of organizations; most were from the UK, with ten countries represented overall. Among the six discussion points in the Knowledge Café, the AI table was the most popular, receiving the highest number of contributions (n = 49). Following the Forum, contributions to the AI topic were categorized into four themes: critical perception (n = 21), current uses (n = 19), specific tools (n = 2), and training wants/needs (n = 7).

Conclusions: While there are critical perspectives on the integration of AI in the IS space, these stem not from a reluctance to adapt and adopt but from a need for structure, education, training, ethical guidance, and systems to support responsible use and transparency of AI. There is interest in automating repetitive and time-consuming tasks, but attendees reported a lack of appropriate supporting tools. More work is required to establish the suitability of currently available tools and their potential to complement the work of IS.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70049
Citations: 0
Human Versus Artificial Intelligence: Comparing Cochrane Authors' and ChatGPT's Risk of Bias Assessments
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-08-31, DOI: 10.1002/cesm.70044
Petek Eylul Taneri
Introduction: Systematic reviews and meta-analyses synthesize randomized trial data to guide clinical decisions but require significant time and resources. Artificial intelligence (AI) offers a promising way to streamline evidence synthesis, aiding study selection, data extraction, and risk of bias assessment. This study evaluates the performance of ChatGPT-4o in assessing the risk of bias in randomized controlled trials (RCTs) using the Risk of Bias 2 (RoB 2) tool, comparing its results with those of human reviewers in Cochrane Reviews.

Methods: A sample of Cochrane Reviews using the RoB 2 tool was identified through the Cochrane Database of Systematic Reviews (CDSR). Protocols, qualitative systematic reviews, and reviews employing other risk of bias assessment tools were excluded. ChatGPT-4o assessed the risk of bias using a structured set of prompts corresponding to the RoB 2 domains. Agreement between ChatGPT-4o and consensus-based human reviewer assessments was evaluated with weighted kappa statistics; accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were also calculated. All analyses were performed in RStudio (R version 4.3.0).

Results: A total of 42 Cochrane Reviews were screened, yielding a final sample of eight eligible reviews comprising 84 RCTs. The primary outcome of each included review was selected for risk of bias assessment. ChatGPT-4o showed moderate agreement with human reviewers for overall risk of bias judgments (weighted kappa = 0.51, 95% CI: 0.36–0.66). Agreement varied across domains, from fair (κ = 0.20 for selection of the reported result) to moderate (κ = 0.59 for measurement of the outcome). ChatGPT-4o had a sensitivity of 53% for identifying high-risk studies and a specificity of 99% for classifying low-risk studies.

Conclusion: This study shows that ChatGPT-4o can perform risk of bias assessments using RoB 2 with fair to moderate agreement with human reviewers. While AI-assisted risk of bias assessment remains imperfect, advances in prompt engineering and model refinement may enhance performance. Future research should explore standardized prompts and investigate interrater reliability among human reviewers to provide a more robust comparison.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70044
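Weighted kappa penalizes disagreements by their ordinal distance, so calling a high-risk trial "some concerns" costs less than calling it "low risk". A minimal linear-weights implementation (illustrative only; real analyses would use an established package such as R's irr or scikit-learn's cohen_kappa_score):

```python
def weighted_kappa(r1, r2, n_cats):
    """Linearly weighted Cohen's kappa for two raters on ordinal codes 0..n_cats-1.

    Disagreement weight |i - j| / (n_cats - 1): adjacent categories are
    penalized less than opposite extremes. Returns 1 for perfect agreement,
    0 for chance-level agreement, negative values for worse than chance.
    """
    n = len(r1)
    # Observed weighted disagreement.
    obs = sum(abs(a - b) for a, b in zip(r1, r2)) / (n * (n_cats - 1))
    # Chance-expected weighted disagreement from the marginal distributions.
    p1 = [r1.count(c) / n for c in range(n_cats)]
    p2 = [r2.count(c) / n for c in range(n_cats)]
    exp = sum(p1[i] * p2[j] * abs(i - j) / (n_cats - 1)
              for i in range(n_cats) for j in range(n_cats))
    return 1.0 - obs / exp

# Perfect agreement on RoB 2 codes (0=low, 1=some concerns, 2=high) gives 1.0.
kappa = weighted_kappa([0, 1, 2, 0], [0, 1, 2, 0], 3)
```

Against this scale, the paper's overall weighted kappa of 0.51 sits in the conventional "moderate" band.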
Citations: 0
Artificial Intelligence and Automation in Evidence Synthesis: An Investigation of Methods Employed in Cochrane, Campbell Collaboration, and Environmental Evidence Reviews
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-08-28, DOI: 10.1002/cesm.70046
Kristen L. Scotti, Sarah Young, Melanie A. Gainey, Haoyong Lan
Automation, including machine learning (ML), is increasingly being explored to reduce the time and effort involved in evidence syntheses, yet its adoption and reporting practices remain under-examined across disciplines (e.g., health sciences, education, and policy). This review assesses the use of automation, including ML-based techniques, in 2271 evidence syntheses published between 2017 and 2024 in the Cochrane Database of Systematic Reviews and the journals Campbell Systematic Reviews and Environmental Evidence. We focus on automation across four review steps: search, screening, data extraction, and analysis/synthesis. We systematically identified eligible studies from the three sources and developed a classification system distinguishing manual, rules-based, ML-enabled, and ML-embedded tools, then extracted data on tool use, ML integration, reporting practices, motivations for (and against) ML adoption, and the application of stopping criteria for ML-assisted screening. Only about 5% of studies explicitly reported using ML, with most applications limited to screening tasks. Although about 12% employed ML-enabled tools, roughly 90% of those did not clarify whether the ML functionality was actually used. Living reviews showed higher relative ML integration (about 15%), but overall uptake remains limited. Previous work has shown that common barriers to broader adoption include limited guidance, low user awareness, and concerns over reliability. Despite ML's potential to streamline evidence syntheses, its integration remains limited and inconsistently reported. Improved transparency, clearer reporting standards, and greater user training are needed to support responsible adoption. As the research literature grows, automation will become increasingly essential, but only if challenges in usability, reproducibility, and trust are addressed.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70046
Citations: 0
Meta-Analysis Using Time-to-Event Data: A Tutorial
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-08-26, DOI: 10.1002/cesm.70041
Ashma Krishan, Kerry Dwan
This tutorial focuses on trials that assess time-to-event outcomes. We explain what hazard ratios are and how to interpret them, and demonstrate how to include time-to-event data in a meta-analysis, with examples to aid understanding. The tutorial is accompanied by a micro-learning module in which we demonstrate several approaches and offer practice calculating the hazard ratio: https://links.cochrane.org/cesm/tutorials/time-to-event-data.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70041
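Because hazard ratios are ratios, meta-analysis combines them on the log scale with inverse-variance weights and exponentiates the pooled result back to the HR scale. A minimal fixed-effect sketch with illustrative numbers (not the tutorial's worked examples):

```python
import math

def pool_hazard_ratios(hrs, log_hr_ses):
    """Fixed-effect pooling of hazard ratios on the log scale.

    Each study contributes log(HR) weighted by 1/se^2, where se is the
    standard error of the log hazard ratio; the pooled log HR and its
    95% CI are exponentiated back to the HR scale.
    """
    weights = [1.0 / se ** 2 for se in log_hr_ses]
    pooled_log = sum(w * math.log(hr)
                     for w, hr in zip(weights, hrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    ci = (math.exp(pooled_log - 1.96 * pooled_se),
          math.exp(pooled_log + 1.96 * pooled_se))
    return math.exp(pooled_log), ci

# Two studies, both HR = 2.0 with se(log HR) = 0.5: pooled HR stays 2.0,
# but the confidence interval narrows relative to either study alone.
hr, (lo, hi) = pool_hazard_ratios([2.0, 2.0], [0.5, 0.5])
```

Working on the log scale keeps the ratio symmetric (an HR of 2 and an HR of 0.5 are equidistant from no effect), which is why HRs are never averaged directly.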
Citations: 0
Lifecycles of Cochrane Systematic Reviews (2003–2024): A Bibliographic Study
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-08-17, DOI: 10.1002/cesm.70043
Shiyin Li, Chong Wu, Zichen Zhang, Mengli Xiao, Mohammad Hassan Murad, Lifeng Lin
Background and Objectives: The relevance of Cochrane systematic reviews depends on timely completion and updates. This study empirically assessed the lifecycles of Cochrane reviews published from 2003 to 2024, including transitions from protocol to review, update patterns, and withdrawals.

Methods: We extracted data from Cochrane Library publications between 2003 and 2024. Each review topic was identified by its unique six-digit DOI-based ID. We recorded protocol publication, review publication, updates, and withdrawals (i.e., removal from the Cochrane Library for editorial or procedural reasons), calculated time intervals between stages, and conducted subgroup analyses by review type.

Results: Of 8137 protocols, 71.9% progressed to reviews (median 25.7 months), 2.4% were updated during the protocol stage, and 10.0% were withdrawn. Among 8477 reviews, 64.3% had never been updated by the time of our analysis; for those updated at least once, the median interval between updates was 57.2 months. Withdrawal occurred for 2.5% of reviews (median 67.6 months post-publication). Subgroup analyses showed variation across review types; diagnostic and qualitative reviews tended to have longer protocol-to-review times than other review types.

Conclusions: Cochrane reviews show long development and update intervals, with variation by review type. Greater use of automation and targeted support may improve review efficiency and timeliness.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70043
Citations: 0
Optimizing Research Impact: A Toolkit for Stakeholder-Driven Prioritization of Systematic Review Topics
Cochrane Evidence Synthesis and Methods, Pub Date: 2025-08-14, DOI: 10.1002/cesm.70039
Dyon Hoekstra, Stefan K. Lhachimi
Introduction: Prioritizing topics for evidence synthesis is crucial for maximizing the relevance and impact of systematic reviews. This article introduces a comprehensive toolkit for a structured, multi-step framework that engages a broad spectrum of stakeholders in the prioritization process, ensuring the selection of topics that are both relevant and applicable.

Methods: We detail an open-source framework of 11 coherent steps, divided into scoping and Delphi stages, offering a flexible and resource-efficient approach to stakeholder involvement in research priority setting.

Results: The toolkit provides ready-to-use tools for developing, applying, and analyzing the framework, including templates for online surveys built with free open-source software, so it can be easily replicated and adapted across research fields. The framework supports the transparent and systematic development and assessment of systematic review topics, with a particular focus on stakeholder-refined assessment criteria.

Conclusion: Our toolkit enhances the transparency and ease of the priority-setting process. Targeted primarily at organizations and research groups seeking to allocate resources for future research based on stakeholder needs, it is a valuable resource for informed decision-making in research prioritization.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cesm.70039
Citations: 0