An AI assistant for critically assessing and synthesizing clusters of journal articles

Author: Louis Anthony Cox Jr.
Journal: Global Epidemiology, Volume 10, Article 100207
Published: 2025-05-23 (Journal Article)
DOI: 10.1016/j.gloepi.2025.100207
URL: https://www.sciencedirect.com/science/article/pii/S2590113325000252
Citations: 0
Abstract
Current large language models (LLMs) face significant challenges in attempting to synthesize and critically assess conflicting causal claims in scientific literature about exposure-associated health effects. This paper examines the design and performance of AIA2, an experimental AI system (freely available at http://cloud.cox-associates.com/) designed to help explore and illustrate potential applications of current AI in assisting analysis of clusters of related scientific articles, focusing on causal claims in complex domains such as epidemiology, toxicology, and risk analysis. Building on an earlier AI assistant, AIA1, which critically reviewed causal claims in individual papers, AIA2 advances the approach by systematically comparing multiple studies to identify areas of agreement and disagreement, suggest explanations for differences in conclusions, flag methodological gaps and inconsistencies, synthesize and summarize well-supported conclusions despite conflicts, and propose recommendations to help resolve knowledge gaps. We illustrate these capabilities with a case study of formaldehyde exposure and leukemia using a cluster of four papers that feature very different approaches and partly conflicting conclusions. AIA2 successfully identifies major points of agreement and contention, discusses the robustness of the evidence for causal claims, and recommends future research directions to address current uncertainties. AIA2's outputs suggest that current AI can offer a promising, practicable approach to AI-assisted review of clusters of papers, promoting methodological rigor, thoroughness, and transparency in review and synthesis, notwithstanding current limitations of LLMs. We discuss the implications of AI-assisted literature review systems for improving evidence-based decision-making, resolving conflicting scientific claims, and promoting rigor and reproducibility in causal research and health risk analysis.