Emergency Medicine Assistants in the Field of Toxicology, Comparison of ChatGPT-3.5 and GEMINI Artificial Intelligence Systems.

Q3 Medicine
Acta Medica Lituanica Pub Date : 2024-01-01 Epub Date: 2024-12-04 DOI:10.15388/Amed.2024.31.2.18
Hatice Aslı Bedel, Cihan Bedel, Fatih Selvi, Ökkeş Zortuk, Yusuf Karancı
Published in Acta Medica Lituanica, 2024;31(2):294-301. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11887820/pdf/
Citations: 0

Abstract

Objective: Artificial intelligence models human thinking and problem-solving, allowing computers to make autonomous decisions. Few studies have examined the clinical utility of GPT and Gemini in toxicology, so their level of competence in this field is not well understood. This study compares the responses given by GPT-3.5 and Gemini with those provided by emergency medicine residents.

Methods: This prospective study focused on toxicology and drew on a widely recognized educational resource in emergency medicine, 'Tintinalli's Emergency Medicine: A Comprehensive Study Guide'. A set of twenty questions, each with five answer options, was devised to test knowledge of the toxicological material covered in the book. These questions were then posed to ChatGPT (GPT-3.5, Generative Pre-trained Transformer 3.5) by OpenAI and Gemini by Google AI, as well as to emergency medicine residents in the clinic, and the resulting answers were analyzed.
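The scoring step described above, in which each respondent answers twenty five-option multiple-choice questions and receives one point per correct answer, can be sketched as follows. The answer key below is hypothetical, not the study's actual key:

```python
# Hypothetical answer key for a twenty-question, five-option (A-E) quiz.
ANSWER_KEY = ["A", "C", "B", "E", "D", "A", "B", "C", "D", "E",
              "A", "B", "C", "D", "E", "A", "B", "C", "D", "E"]

def score(responses):
    """Count how many of a respondent's answers match the key (0-20)."""
    if len(responses) != len(ANSWER_KEY):
        raise ValueError("expected one response per question")
    return sum(r == k for r, k in zip(responses, ANSWER_KEY))

# A respondent who reproduces the key exactly scores the maximum of 20.
print(score(ANSWER_KEY))
```

The same function scores a physician's answer sheet and an AI model's transcribed answers identically, which is what makes the three groups directly comparable.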

Results: Twenty-eight physicians, 35.7% of whom were women, were included in the study. Physician and AI scores were compared. Although the overall comparison reached significance (F = 2.368, p < 0.001), the post-hoc Tukey test found no significant difference between any two groups. The mean score was 9.9 ± 0.71 for GPT-3.5, 11.30 ± 1.17 for Gemini, and 9.82 ± 3.70 for the physicians (Figure 1).
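The overall comparison reported above is a one-way ANOVA across the three groups of quiz scores, followed by a post-hoc Tukey test. A minimal sketch of the ANOVA F statistic, using hypothetical score lists rather than the study's raw data:

```python
# One-way ANOVA F statistic, computed from first principles.
def one_way_anova_f(*groups):
    """Return the F statistic for a one-way ANOVA across the given groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    k = len(groups)            # number of groups
    n = len(all_scores)        # total observations
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical scores out of 20, for illustration only.
physicians = [6, 8, 9, 10, 11, 12, 14]
gpt35      = [9, 10, 10, 10, 10, 10, 10]
gemini     = [10, 11, 11, 11, 12, 12, 12]

f_stat = one_way_anova_f(physicians, gpt35, gemini)
print(round(f_stat, 3))
```

In practice this would be followed by a pairwise post-hoc test (e.g. Tukey's HSD, available in `statsmodels` or recent SciPy) to check which specific group pairs differ; in the study, none did.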

Conclusions: GPT-3.5 and Gemini answer toxicology questions at a level comparable to that of emergency medicine residents.

Source journal: Acta Medica Lituanica (Medicine - General Medicine)
CiteScore: 0.70 · Self-citation rate: 0.00% · Annual articles: 33 · Review time: 16 weeks