Randomized controlled trials evaluating large language models in digestive diseases: a scoping review
Peng Wu, Fuxiao Li, Yuanxi Jia, Jiaqian Yin, Yubing Shen, Yanxiao Gao, Ying Li, Feng Sha, Zhirong Yang, Jinling Tang
Gastroenterology & Endoscopy, Volume 3, Issue 4, Pages 232-240. Published 2025-10-01.
DOI: 10.1016/j.gande.2025.09.003
https://www.sciencedirect.com/science/article/pii/S2949752325000792
Abstract
This scoping review summarizes the current landscape of randomized controlled trials (RCTs) evaluating the use of large language models (LLMs) in digestive diseases. We conducted a systematic search of PubMed, Web of Science, Scopus, and CENTRAL for published RCTs, and of ClinicalTrials.gov and the International Clinical Trials Registry Platform (ICTRP) for ongoing trials. We included RCTs on digestive diseases in which LLMs were the primary intervention. A total of four published and ten ongoing RCTs were identified. Most trials were conducted in China and the United States, primarily as single-country, single-center studies. Gastrointestinal diseases were the main focus, followed by hepatobiliary conditions. The assessed LLM applications predominantly supported clinical decision-making and patient education, with question answering emerging as the most common natural language processing task. Notably, the trials were relatively balanced in their use of general-purpose versus domain-specific LLMs, reflecting diverse strategies in model deployment. The most common study design involved comparisons between LLM-assisted and unassisted approaches, with primary outcomes centered on various aspects of care management. The main limitation of this review is the relatively small number of available studies. Despite this, the identified trials highlight the promising potential of LLM applications in digestive diseases. Further international, multicenter, well-reported RCTs with a focus on real patient outcomes are urgently needed to confirm the usefulness of LLMs in clinical practice.