Nicolò Gilardi, Massimo Ballabio, Francesco Ravera, Lorenzo Ferrando, Mario Stabile, Andrea Bellodi, Giovanni Talerico, Benedetta Cigolini, Carlo Genova, Federico Carbone, Fabrizio Montecucco, Christian Bracco, Alberto Ballestrero, Gabriele Zoppoli
European Journal of Clinical Investigation, e70113. Journal Article. Published 2025-09-08.
DOI: 10.1111/eci.70113 (https://doi.org/10.1111/eci.70113)
Influence of medical educational background on the diagnostic quality of ChatGPT-4 responses in internal medicine: A pilot study
Abstract
This pilot study evaluated the influence of medical background on the diagnostic quality of ChatGPT-4's responses in Internal Medicine. Third-year students, residents and specialists summarised five complex NEJM clinical cases before querying ChatGPT-4. Diagnostic ranking, assessed by independent experts, revealed that residents significantly outperformed students (OR 2.33, p = .007), although overall performance was low. These findings indicate that user expertise and concise case summaries are critical for optimising AI diagnostics, highlighting the need for enhanced AI training and user interaction strategies.
About the journal:
EJCI considers any original contribution from the most sophisticated basic molecular sciences to applied clinical and translational research and evidence-based medicine across a broad range of subspecialties. The EJCI publishes reports of high-quality research that pertain to the genetic, molecular, cellular, or physiological basis of human biology and disease, as well as research that addresses prevalence, diagnosis, course, treatment, and prevention of disease. We are primarily interested in studies directly pertinent to humans, but submission of robust in vitro and animal work is also encouraged. Interdisciplinary work and research using innovative methods and combinations of laboratory, clinical, and epidemiological methodologies and techniques are of great interest to the journal. Several categories of manuscripts are considered: editorials, original articles (including randomized clinical trials, systematic reviews and meta-analyses), narrative reviews, opinion articles (including debates, perspectives and commentaries), and letters to the Editor.