Alejandro García-Rudolph, David Sanchez-Pinsach, Mira Caridad Fernandez, Sandra Cunyat, Eloy Opisso, Elena Hernandez-Pena
How Chatbots Respond to NCLEX-RN Practice Questions: Assessment of Google Gemini, GPT-3.5, and GPT-4
Nursing Education Perspectives, pp. E18-E20, published 2025-03-01 (Epub 2024-12-18)
DOI: 10.1097/01.NEP.0000000000001364
Citations: 0
Abstract
ChatGPT often "hallucinates" or misleads users, underscoring the need for formal validation at the professional level before it can be used reliably in nursing education. We evaluated two free chatbots (Google Gemini and GPT-3.5) and one commercial chatbot (GPT-4) on 250 standardized questions from a simulated nursing licensure exam that closely matches the content and complexity of the actual exam. Gemini answered 73.2 percent of questions correctly (183/250), GPT-3.5 answered 72 percent correctly (180/250), and GPT-4 performed notably better at 92.4 percent (231/250). GPT-4's highest error rate (13.3%) occurred in the psychosocial integrity category.
About the Journal
A publication of the National League for Nursing, Nursing Education Perspectives is a peer-reviewed, bimonthly journal that provides evidence for best practices in nursing education. Through the publication of rigorously designed studies, the journal contributes to the advancement of the science of nursing education. It serves as a forum for research and innovation in teaching and learning, curricula, technology, and other issues important to nursing education. As nurse educators strive to advance research, break away from established patterns, and chart new pathways in nursing education, Nursing Education Perspectives is a vital resource. The journal is housed in the NLN Chamberlain College of Nursing for the Advancement of the Science of Nursing Education.