Evaluation of AI-Based Chatbots in Liver Cancer Information Dissemination: A Comparative Analysis of GPT, DeepSeek, Copilot, and Gemini
Authors: Mustafa Karaagac, Sedat Carkit
Journal: Oncology, pp. 1-15 (Journal Article; impact factor 2.5; JCR Q3, Oncology)
Published: 2025-06-10
DOI: 10.1159/000546726 (https://doi.org/10.1159/000546726)
Citations: 0
Abstract
Background/objectives: This study aimed to evaluate AI-based chatbots (GPT, DeepSeek, Copilot, Gemini) in disseminating information on liver cancer, emphasizing content quality, adherence to established guidelines, and ease of comprehension.
Methods: Between January and February 2025, four chatbots were examined using publicly accessible free versions lacking independent reasoning capabilities. Three frequently searched Google Trends questions ("What is liver cancer awareness?", "What are the symptoms of liver cancer?", and "Is liver cancer treatable?") were posed. Their responses were assessed via the DISCERN instrument, Coleman-Liau Index, Patient Education Materials Assessment Tool for Print, and alignment with American Association for the Study of Liver Diseases, National Comprehensive Cancer Network, and European Society for Medical Oncology recommendations. Statistical analysis was performed using SPSS 22.
Results: All chatbots largely provided relevant and impartial information. GPT and DeepSeek scored lower on specifying information sources and update timelines, whereas Copilot omitted local therapies (e.g., Radiofrequency Ablation, Transarterial Chemoembolization, Transarterial Radioembolization), resulting in reduced scientific accuracy. Gemini and Copilot performed better in "Understandability," while GPT and DeepSeek excelled in "Actionability." Although GPT demonstrated consistency across multiple treatment options, it did not explicitly reference international guidelines. Study limitations included language constraints, variations in chatbot updates, and reliance on a single inquiry round.
Conclusions: AI chatbots show potential as initial informational tools for liver cancer but cannot replace professional medical consultation. In complex diseases requiring multidisciplinary management, frequent guideline-based updates, expert validation, and diverse data sources are critical to enhancing clinical relevance and patient outcomes.
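The Coleman-Liau Index named in the Methods is a readability score computed from letter, word, and sentence counts. The abstract does not say how the authors calculated it, so the following is only a minimal sketch of the standard published formula, CLI = 0.0588 L - 0.296 S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. The function name and the regex-based word and sentence splitting are assumptions made for illustration, not the study's procedure.

```python
import re


def coleman_liau_index(text: str) -> float:
    """Return the Coleman-Liau Index (approximate US grade level) of text.

    Tokenization here is a simplifying assumption: words are runs of
    ASCII letters (optionally joined by ' or -), and every run of
    '.', '!' or '?' counts as one sentence end, so abbreviations such
    as "e.g." inflate the sentence count.
    """
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)
    if not words:
        raise ValueError("text contains no words")

    # Count only alphabetic characters inside the matched words.
    letters = sum(ch.isalpha() for word in words for ch in word)
    # Treat text without terminal punctuation as a single sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))

    letters_per_100_words = letters / len(words) * 100
    sentences_per_100_words = sentences / len(words) * 100
    return (0.0588 * letters_per_100_words
            - 0.296 * sentences_per_100_words
            - 15.8)


if __name__ == "__main__":
    # One of the three questions posed in the study, used only as sample input.
    sample = "What are the symptoms of liver cancer?"
    print(round(coleman_liau_index(sample), 2))
```

Higher values indicate text that demands a higher reading level, so a score computed this way for each chatbot response could be compared against a target grade level for patient materials.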
About the journal:
Although laboratory and clinical cancer research need to be closely linked, observations at the basic level often remain removed from medical applications. This journal works to accelerate the translation of experimental results into the clinic, and back again into the laboratory for further investigation. The fundamental purpose of this effort is to advance clinically relevant knowledge of cancer and to improve the outcome of prevention, diagnosis, and treatment of malignant disease. The journal publishes significant clinical studies from cancer programs around the world, along with important translational laboratory findings, mini-reviews (invited and submitted), and in-depth discussions of evolving and controversial topics in the oncology arena. A unique feature of the journal is a new section that focuses on rapid peer review and subsequent publication of short reports of phase 1 and phase 2 clinical cancer trials, with the goal of ensuring that high-quality clinical cancer research quickly enters the public domain, regardless of the trial’s ultimate conclusions regarding efficacy or toxicity.