Title: A novel approach for visualizing local consistency in network meta-analysis
Authors: Huw Wilson, Anton Schönstein, Sarah Robson, Federico Bonofiglio
DOI: 10.1017/rsm.2026.10082 (https://doi.org/10.1017/rsm.2026.10082)
Journal: Research Synthesis Methods, pp. 1-15, published 2026-04-07

Abstract: Network meta-analysis is the established method for pooling evidence from multiple clinical trials and making direct and indirect comparisons between different treatments. To ensure its validity, one of the major assumptions requiring examination is that the different sources of information are consistent, that is, that the direct and indirect effect estimates agree. There are at least three aspects to consider: (1) the original effect sizes of the direct and indirect treatment effects and their relative contribution to the total evidence; (2) the difference between them and its associated uncertainty/significance; and (3) the type of difference between them, that is, whether the direct and indirect estimates agree that a treatment is beneficial or harmful. Current visualization approaches typically use forest plots or heat maps, but these are limited because at least one of the above aspects is usually absent. Furthermore, as the number of treatments in the network increases, these visualizations can become difficult to understand. We present a visualization that combines all three aspects while remaining easy to interpret, outline its mathematical background, and provide R code to produce it.
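Aspect (2) above, the difference between the direct and indirect estimates and its significance, is commonly quantified with a Bucher-style calculation. The paper provides R code for its visualization; the following is only a generic Python sketch of the underlying consistency check, with made-up effect sizes and standard errors.

```python
import math

def indirect_estimate(d_ac, se_ac, d_bc, se_bc):
    """Indirect A-vs-B estimate obtained via a common comparator C:
    d_AB(ind) = d_AC - d_BC, with variances adding."""
    return d_ac - d_bc, math.sqrt(se_ac**2 + se_bc**2)

def bucher_consistency(d_direct, se_direct, d_indirect, se_indirect):
    """Inconsistency estimate: difference between direct and indirect
    effects, its standard error, and the corresponding z-statistic."""
    diff = d_direct - d_indirect
    se_diff = math.sqrt(se_direct**2 + se_indirect**2)
    return diff, se_diff, diff / se_diff

# Hypothetical log-odds-ratio estimates for illustration only
d_ind, se_ind = indirect_estimate(0.50, 0.15, 0.30, 0.20)
diff, se, z = bucher_consistency(0.25, 0.10, d_ind, se_ind)
```

A |z| well above 1.96 would flag local inconsistency on this comparison; the visualization described in the paper additionally encodes the evidence contributions and the direction of each estimate.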
Title: CiteSource: An R package for data-driven search strategy development and enhanced evidence synthesis reporting
Authors: Trevor Riley, Sarah Young, Avery Paxton, Lukas Wallrich, Kaitlyn Hair, Matthew Grainger
DOI: 10.1017/rsm.2026.10084 (https://doi.org/10.1017/rsm.2026.10084)
Journal: Research Synthesis Methods, pp. 1-19, published 2026-04-06

Abstract: Evidence synthesis findings hinge upon well-designed, effective search strategies. When developing these strategies, evidence synthesis teams make multiple decisions (e.g., selecting information sources, developing search string architecture, and picking supplementary search methods) that directly affect the breadth of discovered evidence and thus evidence synthesis outcomes. Despite the number of decisions required, limited guidance exists to inform them using a data-driven approach. To help address this gap, we developed CiteSource, an R package and accompanying Shiny application that supports data-driven search strategy development and reporting. CiteSource allows users to assign and retain metadata across three custom fields (source, label, and string) to indicate where records were found, what method or string was used to find them, and whether they were included after screening. CiteSource allows users to visually map the overlap between sets of records, create data summaries of citation records, and export citation records with the newly assigned metadata. Its analysis and visualization outputs can be harnessed for a variety of use cases, such as optimizing literature source selection, honing search strings and understanding their effectiveness, and evaluating the impact of literature sources and supplementary search methods. Overall, CiteSource provides a tool for evidence synthesizers to make informed, data-driven decisions that boost the efficiency, rigor, and transparency of search strategies and associated reporting.
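To illustrate the kind of overlap mapping CiteSource performs, here is a minimal Python sketch that counts pairwise overlap and source-unique records across deduplicated record sets. The source names and record IDs are hypothetical, and this is a conceptual sketch, not CiteSource's actual implementation.

```python
from itertools import combinations

# Hypothetical deduplicated records: source name -> set of record IDs (e.g. DOIs)
sources = {
    "MEDLINE":    {"d1", "d2", "d3", "d4"},
    "Scopus":     {"d2", "d3", "d5"},
    "Handsearch": {"d4", "d6"},
}

# Pairwise overlap: how many records each pair of sources shares
pairwise = {
    (a, b): len(sources[a] & sources[b])
    for a, b in combinations(sources, 2)
}

# Records found by one source and no other (a source's unique contribution)
unique = {
    s: ids - set().union(*(v for k, v in sources.items() if k != s))
    for s, ids in sources.items()
}
```

Summaries like these can inform decisions such as whether a database adds enough unique records to justify its screening cost.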
Title: Meta-analytic pooling of intraclass correlation coefficient estimates
Authors: Bethany H Bhat, S Natasha Beretvas
DOI: 10.1017/rsm.2026.10077 (https://doi.org/10.1017/rsm.2026.10077)
Journal: Research Synthesis Methods, pp. 1-34, published 2026-03-30

Abstract: Intraclass correlation coefficient (ICC) estimates are necessary for several statistical techniques. Researchers need accurate ICC estimates when conducting prospective power analyses for clustered data scenarios. In addition, meta-analysts require reasonable ICC values when adjusting effect size estimates to account for clustered primary study data or to correct for psychometric artifacts when using the ICC as a reliability measure. The validity of these analyses hinges on the accuracy of the ICC estimate. Beyond these secondary analyses, ICC estimates have been used as the focal outcome of meta-analysis itself to obtain a pooled measure of agreement, reliability, or the influence of a cluster's effect. This study evaluates how well meta-analytically pooled ICC estimates recover the population ICC parameter value when using different ICC variance formulas as the inverse-variance weights in the pooling. We found that the variance formula based on a normalizing transformation performs best across most conditions.
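The abstract does not reproduce the variance formulas the study compares, so the following is only a generic Python sketch of the pooling step: fixed-effect inverse-variance pooling on a transformed scale, using the Fisher z transformation as one illustrative normalizing choice and treating the per-study variances as given. The specific variances the paper recommends are not shown here.

```python
import math

def fisher_z(r):
    """Fisher z transformation, one possible normalizing transformation."""
    return 0.5 * math.log((1 + r) / (1 - r))

def inv_fisher_z(z):
    """Back-transform from the z scale to the correlation scale."""
    return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)

def pool_icc(iccs, variances):
    """Fixed-effect inverse-variance pooling of ICC estimates on the
    transformed scale; `variances` are the (assumed known) sampling
    variances of the transformed estimates."""
    zs = [fisher_z(r) for r in iccs]
    ws = [1.0 / v for v in variances]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return inv_fisher_z(z_bar)

# Hypothetical per-study ICCs and variances, for illustration only
pooled = pool_icc([0.10, 0.20, 0.15], [0.004, 0.002, 0.004])
```

The study's point is that the choice of variance formula used for the weights materially affects how well such a pooled estimate recovers the population ICC.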
Title: Meta-analytic-predictive priors based on a single study
Authors: Christian Röver, Tim Friede
DOI: 10.1017/rsm.2026.10081 (https://doi.org/10.1017/rsm.2026.10081)
Journal: Research Synthesis Methods, pp. 1-19, published 2026-03-24

Abstract: Meta-analytic-predictive (MAP) priors have been proposed as a generic approach to deriving informative prior distributions, in which external empirical data are processed to learn about certain parameter distributions. The use of MAP priors is also closely related to shrinkage estimation (sometimes referred to as dynamic borrowing). A potentially odd situation arises when the external data consist of only a single study. Conceptually, this is not a problem; it only implies that certain prior assumptions gain in importance and need to be specified with particular care. We outline this important and not uncommon special case and demonstrate its implementation and interpretation based on the normal-normal hierarchical model. The approach is illustrated using example applications in clinical medicine.
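A minimal Monte Carlo sketch of the single-study situation under the normal-normal hierarchical model, assuming a flat prior on the overall mean and a half-normal prior on the heterogeneity tau (the priors and numbers here are illustrative choices, not taken from the paper). With one external study the data carry essentially no information about tau, so the heterogeneity prior drives the width of the resulting MAP prior, which is exactly the point the authors emphasize.

```python
import random
import statistics

random.seed(1)
y1, s1 = 0.20, 0.10   # hypothetical external study: estimate and standard error
tau_scale = 0.05      # scale of the half-normal heterogeneity prior (assumed)

draws = []
for _ in range(100_000):
    tau = abs(random.gauss(0, tau_scale))            # tau ~ half-normal
    # With a flat prior on mu:  mu | y1, tau ~ N(y1, s1^2 + tau^2)
    mu = random.gauss(y1, (s1**2 + tau**2) ** 0.5)
    theta_new = random.gauss(mu, tau)                # predictive draw: new study
    draws.append(theta_new)

map_mean = statistics.fmean(draws)   # MAP prior location, ~ y1
map_sd = statistics.stdev(draws)     # MAP prior spread, inflated by tau
```

Doubling `tau_scale` visibly widens the MAP prior while leaving its center at the single study's estimate, illustrating how the heterogeneity prior controls the amount of borrowing.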
Title: BibliZap: An exploratory evaluation of an automated multi-level citation searching tool for systematic and rapid reviews
Authors: Raphaël Bentegeac, Bastien Le Guellec, Victor Leblanc, Rémi Lenain, Luc Dauchet, Victoria Gauthier, Erwin Gerard, Emmanuel Chazard, Philippe Amouyel, Estelle Aymes, Aghilès Hamroun
DOI: 10.1017/rsm.2026.10079 (https://doi.org/10.1017/rsm.2026.10079)
Journal: Research Synthesis Methods, pp. 1-14, published 2026-03-24

Abstract: The exponential growth of scientific literature poses increasing challenges for evidence synthesis. Systematic reviews (SRs) usually rely on keyword-based database searches, which are limited by inconsistent terminology and indexing delays. Citation searching (identifying studies that cite or are cited by known relevant articles) offers a complementary route to uncovering additional evidence but remains poorly automated and poorly integrated into screening workflows. We developed BibliZap, an open-source, fully automated citation-searching tool built on Lens.org data that performs multi-level forward and backward citation searches with relevance-based ranking. Its performance was evaluated across 66 published SRs, comparing five approaches: (1) PubMed-only searches; (2) PubMed followed by BibliZap restricted to the top 500 ranked results; (3) PubMed followed by full BibliZap screening; and (4, 5) two exploratory early-stop strategies in which BibliZap was initiated after identifying the first, or the first three, relevant PubMed records. The primary outcome was sensitivity, with secondary assessments of screening workload and precision. When used after PubMed screening, BibliZap increased mean sensitivity from 75% to 97%, achieving complete recall in over half of the reviews. Screening only the top 500 outputs still allowed over 90% of reviews to reach or exceed 80% recall. BibliZap recovered a median of three additional included articles per review not retrieved by PubMed, while adding a median of 6,450 additional records to screen. Citation searching via BibliZap enhances the completeness of evidence retrieval in SRs based on restricted database searches and supports transparent, scalable workflows adaptable to rapid and exploratory review contexts.
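BibliZap's actual ranking algorithm and its Lens.org integration are not described in the abstract, so the following is only a toy Python sketch of the general idea: multi-level forward and backward citation snowballing from a seed article, with a simple reach-count score over a hypothetical citation graph.

```python
from collections import Counter

# Hypothetical citation graph: paper -> papers it cites (backward edges)
cites = {
    "seed": ["a", "b"],
    "x": ["seed", "a"],
    "y": ["seed"],
    "a": ["c"],
    "b": [],
    "c": [],
}
# Forward edges derived from the backward ones: paper -> papers citing it
cited_by = {p: [q for q, refs in cites.items() if p in refs] for p in cites}

def snowball(seeds, levels=2):
    """Multi-level backward (references) and forward (citing papers) search,
    scoring each record by how often it is reached from the seed set."""
    score, frontier = Counter(), set(seeds)
    for _ in range(levels):
        nxt = set()
        for p in frontier:
            for q in cites.get(p, []) + cited_by.get(p, []):
                score[q] += 1
                nxt.add(q)
        frontier = nxt - set(seeds)
    for s in seeds:            # seeds are already known; drop them
        score.pop(s, None)
    return [p for p, _ in score.most_common()]

ranked = snowball(["seed"])
```

A reviewer would then screen this ranked list top-down, which is why restricting screening to the highest-ranked outputs (the top 500 in the evaluation) can retain most of the recall at a fraction of the workload.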
Title: NMAstudio 2.0: An interactive tool for network meta-analysis to enhance understanding, interpretation, and communication of the findings
Authors: Tianqi Yu, Silvia Metelli, Theodoros Papakonstantinou, Anna Chaimani
DOI: 10.1017/rsm.2026.10074 (https://doi.org/10.1017/rsm.2026.10074)
Journal: Research Synthesis Methods, pp. 1-14, published 2026-03-06

Abstract: Network meta-analysis (NMA) is a vital methodology for synthesizing evidence across multiple treatments and informing medical decision-making. However, effective visualization and interpretation of results from large networks of interventions remain challenging, particularly for non-specialists. NMAstudio 2.0 is an interactive web application designed to address these difficulties by streamlining NMA workflows and enhancing result visualization. Developed using Python and R, it integrates seamlessly with established NMA frameworks. Our exemplar application, using a Cochrane Review comparing several treatments for chronic plaque psoriasis, demonstrates its capacity to facilitate all crucial steps of an NMA. The application features an intuitive interface for uploading data, automating analyses, and generating interactive visualizations such as network diagrams, forest plots, and ranking plots, as well as unique outputs such as boxplots for transitivity checks and bidimensional forest plots. Most outputs are dynamically linked with the network diagram, enabling users to interactively explore evidence networks, apply advanced filtering, and highlight specific features by selecting nodes or edges. While NMAstudio 2.0 aims to simplify NMAs, it also incorporates checks during data upload to mitigate the risk of producing poorly reported NMAs. NMAstudio 2.0 represents a significant step forward in improving the usability and accessibility of NMA, offering researchers a robust, versatile platform for evidence synthesis. Its integration of advanced features with an emphasis on user experience makes it a valuable resource for enhancing decision-making and promoting evidence-based practice across diverse contexts.
Title: A causal meta-analysis framework for clinical trials with unequal randomization ratios
Authors: Dazheng Zhang, Bingyu Zhang, Lu Li, Haitao Chu, Yong Chen
DOI: 10.1017/rsm.2025.10069 (https://doi.org/10.1017/rsm.2025.10069)
Journal: Research Synthesis Methods, pp. 1-12, published 2026-03-05

Abstract: Meta-analysis synthesizes evidence from multiple randomized clinical trials and informs evidence-based practice across medical domains. Recently, causally interpretable meta-analysis has been proposed and applied to treatment evaluations for target populations, but it requires individual participant data (IPD). Standard meta-analysis assumes transportability or exchangeability of a (conditional) relative effect (such as a relative risk or odds ratio), which may be violated when relative effects are correlated with baseline risks across trials. In addition, a weighted average of study-specific effect measures such as (log) odds ratios or (log) hazard ratios is non-collapsible and does not correspond to any target population. Furthermore, when the randomization ratios between treated and untreated arms vary across trials, confounding bias may occur. To address these challenges, we propose a causal meta-analysis (CMA) framework using only aggregated data, enabling causally interpretable and accurate estimation for different target populations. CMA adapts its weights to target different estimands, including the average treatment effect (ATE), the ATE on the treated (ATT), the ATE on the controls (ATC), and the ATE in the overlap population (ATO). Mathematically, we establish connections between traditional meta-analysis estimators and CMA; for example, Mantel-Haenszel weighted meta-analysis is equivalent to the CMA targeting the ATO.
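The stated equivalence can be checked numerically in the risk-difference case: the Mantel-Haenszel weight n1*n0/(n1+n0) equals n * e * (1 - e), where e = n1/(n1+n0) is the within-trial probability of treatment, and e(1 - e) is exactly the overlap (ATO) weight aggregated over a trial. A small Python check with hypothetical trial sizes:

```python
# Hypothetical (n_treated, n_control) pairs with unequal randomization ratios
trials = [(100, 50), (80, 80), (30, 90)]

# Classic Mantel-Haenszel weights (risk-difference form): n1 * n0 / n
mh_weights = [n1 * n0 / (n1 + n0) for n1, n0 in trials]

# Overlap (ATO) weights aggregated by trial: n * e * (1 - e),
# where e = n1 / n is the within-trial propensity of treatment
ato_weights = []
for n1, n0 in trials:
    n = n1 + n0
    e = n1 / n
    ato_weights.append(n * e * (1 - e))

# The two weighting schemes coincide trial by trial
assert all(abs(a - b) < 1e-9 for a, b in zip(mh_weights, ato_weights))
```

Algebraically, n * (n1/n) * (n0/n) = n1 * n0 / n, so the identity holds for any sample sizes; the paper develops the general connection beyond this special case.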
Title: Impact of matrix-construction assumptions on quantitative overlap assessment in overviews: A meta-research study
Authors: Javier Bracchiglione, Nicolás Meza, Dawid Pieper, Carole Lunny, Manuel Vargas-Peirano, Johanna Vicuña, Fernando Briceño, Roberto Garnham Parra, Ignacio Pérez Carrasco, Gerard Urrútia, Xavier Bonfill, Eva Madrid
DOI: 10.1017/rsm.2025.10056 (https://doi.org/10.1017/rsm.2025.10056)
Journal: Research Synthesis Methods, 17(2), pp. 348-364, published 2026-03-01
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873615/pdf/

Abstract: Overlap of primary studies among multiple systematic reviews (SRs) is a major challenge when conducting overviews. The corrected covered area (CCA) is a metric computed from a matrix of evidence that quantifies this overlap; consequently, the assumptions used to generate the matrix may significantly affect the CCA. We aimed to explore how these varying assumptions influence CCA calculations. We searched two databases for intervention-focused overviews published during 2023. Two reviewers conducted study selection and data extraction, recording overview characteristics and the methods used to handle overlap. For seven sampled overviews, we calculated overall and pairwise CCA across 16 scenarios representing four matrix-construction assumptions. Of 193 included overviews, only 23 (11.9%) adhered to an overview-specific reporting guideline (e.g., PRIOR). Eighty-five (44.0%) did not address overlap; 14 (7.3%) mentioned it only in the discussion; and 94 (48.7%) incorporated it into methods or results (38 using the CCA). Among the seven sampled overviews, CCA values varied with matrix-construction assumptions, ranging from 1.2% to 13.5% with the overall method and from 0.0% to 15.7% with the pairwise method. CCA values may thus depend on assumptions made during matrix construction, including scope, treatment of structural missingness, and handling of publication threads. This variability calls into question the uncritical use of current CCA thresholds and underscores the need for overview authors to report both overall and pairwise CCA calculations. Our preliminary guidance for transparently reporting matrix-construction assumptions may improve the accuracy and reproducibility of CCA assessments.
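The CCA itself is straightforward to compute once the matrix is fixed; the hard part the study highlights is constructing the matrix. A minimal Python sketch using the standard formula CCA = (N - r) / (r*c - r), where N is the total number of inclusions in the matrix, r the number of primary studies (rows), and c the number of reviews (columns). The example matrix is hypothetical.

```python
def cca(matrix):
    """Corrected covered area for an evidence matrix:
    rows = primary studies, columns = systematic reviews,
    entry 1 if the study is included in that review."""
    r = len(matrix)                       # number of primary studies
    c = len(matrix[0])                    # number of reviews
    n = sum(sum(row) for row in matrix)   # covered area: total inclusions
    return (n - r) / (r * c - r)

# Hypothetical matrix: 4 primary studies across 3 reviews
m = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 1],
]
overlap = cca(m)  # (8 - 4) / (12 - 4) = 0.5, i.e. 50% overlap
```

Because N depends directly on which rows, columns, and inclusions the analyst counts, the matrix-construction assumptions the study examines feed straight into this single number.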
Title: Compact large language models for title and abstract screening in systematic reviews: An assessment of feasibility, accuracy, and workload reduction
Authors: Antonio Sciurti, Giuseppe Migliara, Leonardo Maria Siena, Claudia Isonne, Maria Roberta De Blasiis, Alessandra Sinopoli, Jessica Iera, Carolina Marzuillo, Corrado De Vito, Paolo Villari, Valentina Baccolini
DOI: 10.1017/rsm.2025.10044 (https://doi.org/10.1017/rsm.2025.10044)
Journal: Research Synthesis Methods, 17(2), pp. 332-347, published 2026-03-01
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873614/pdf/

Abstract: Systematic reviews play a critical role in evidence-based research but are labor-intensive, especially during title and abstract screening. Compact large language models (LLMs) offer the potential to automate this process, balancing time and cost requirements against accuracy. The aim of this study was to assess the feasibility, accuracy, and workload reduction of three compact LLMs (GPT-4o mini, Llama 3.1 8B, and Gemma 2 9B) in screening titles and abstracts. Records were sourced from three previously published systematic reviews, and the LLMs were asked to rate each record from 0 to 100 for inclusion using a structured prompt. Predefined rating thresholds of 25, 50, and 75 were used to compute performance metrics (balanced accuracy, sensitivity, specificity, positive and negative predictive value, and workload saving). Processing time and costs were recorded. Across the systematic reviews, the LLMs achieved high sensitivity (up to 100%) but low precision (below 10%) for records included at full text. Specificity and workload savings improved at higher thresholds, with the 50- and 75-rating thresholds offering the best trade-offs. GPT-4o mini, accessed via an application programming interface, was the fastest model (at most ~40 minutes) and incurred usage costs of $0.14-$1.93 per review. Llama 3.1 8B and Gemma 2 9B ran locally over longer times (at most ~4 hours) and were free to use. Overall, the LLMs were highly sensitive tools for title/abstract screening and reached high specificity values, allowing substantial workload savings at reasonable cost and processing time, although their precision was low. High sensitivity and workload reduction remain the key factors supporting their use in the title/abstract screening phase of systematic reviews.
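The threshold-based metrics are easy to reproduce. A minimal Python sketch with hypothetical ratings and human screening labels; "workload saving" is computed here as the fraction of records the model would screen out at the chosen threshold, one plausible reading of the abstract.

```python
def screening_metrics(ratings, labels, threshold):
    """Confusion-matrix metrics for LLM inclusion ratings (0-100)
    against human screening decisions, at a given inclusion threshold."""
    pred = [r >= threshold for r in ratings]
    tp = sum(p and y for p, y in zip(pred, labels))
    fp = sum(p and not y for p, y in zip(pred, labels))
    fn = sum((not p) and y for p, y in zip(pred, labels))
    tn = sum((not p) and (not y) for p, y in zip(pred, labels))
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # Fraction of records a human would no longer need to screen
    workload_saving = (tn + fn) / len(ratings)
    return sensitivity, specificity, workload_saving

ratings = [90, 80, 60, 40, 30, 10]                   # hypothetical LLM ratings
labels = [True, True, False, False, False, False]    # human decisions
sens, spec, saved = screening_metrics(ratings, labels, threshold=50)
```

Raising the threshold trades sensitivity for specificity and workload saving, which is why the 50- and 75-rating cutoffs emerged as the practical operating points in the study.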
Title: Beyond human gold standards: A multimodel framework for automated abstract classification and information extraction
Authors: Delphine S Courvoisier, Diana Buitrago-Garcia, Clément P Buclin, Nils Bürgisser, Michele Iudici, Denis Mongin
DOI: 10.1017/rsm.2025.10054 (https://doi.org/10.1017/rsm.2025.10054)
Journal: Research Synthesis Methods, 17(2), pp. 365-377, published 2026-03-01
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873610/pdf/

Abstract: Meta-research and evidence synthesis require considerable resources. Large language models (LLMs) have emerged as promising tools to assist in these processes, yet their performance varies across models, limiting their reliability. Taking advantage of the wide availability of small (<10 billion parameters) open-source LLMs, we implemented an agreement-based framework in which a decision is taken only if at least a given number of LLMs produce the same response; otherwise the decision is withheld. This approach was tested on 1,020 abstracts of randomized controlled trials in rheumatology, using two classic literature review tasks: (1) classifying each intervention as drug or non-drug based on text interpretation, and (2) extracting the total number of randomized patients, a task that sometimes required calculations. Re-examining abstracts on which at least four LLMs disagreed with the human gold standard (dual review with adjudication) allowed us to construct an improved gold standard. Compared to the human gold standard and to single large LLMs (>70 billion parameters), our framework demonstrated robust performance: several model combinations (e.g., 3 of 5, 4 of 6, or 5 of 7 models) achieved accuracies above 95%, exceeding the human gold standard, on at least 85% of abstracts. Performance variability across individual models was not an issue, as low-performing models contributed fewer accepted decisions. This agreement-based framework offers a scalable solution that can replace human reviewers for most abstracts, reserving human expertise for more complex cases. Such frameworks could significantly reduce the manual burden in systematic reviews while maintaining high accuracy and reproducibility.
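The agreement rule at the heart of the framework is simple to express. A minimal Python sketch, with an illustrative vote threshold and made-up model outputs; the actual framework additionally orchestrates prompting and response parsing across the individual LLMs.

```python
from collections import Counter

def agreement_decision(responses, min_agree):
    """Accept a value only if at least `min_agree` models return the same
    answer; otherwise withhold the decision (return None) for human review."""
    value, count = Counter(responses).most_common(1)[0]
    return value if count >= min_agree else None

# Hypothetical outputs from 5 models extracting the number of randomized patients
assert agreement_decision([120, 120, 120, 118, 120], min_agree=3) == 120
# No value reaches the agreement threshold: the decision is withheld
assert agreement_decision([120, 118, 60, 240, 115], min_agree=3) is None
```

Withheld cases go to human reviewers, which is how the framework concentrates expert effort on the hard abstracts while weak models, which rarely join a winning majority, contribute few accepted decisions.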