Performance of ChatGPT in the Portuguese National Residency Access Examination.

IF 1 | Zone 4, Medicine | Q3, MEDICINE, GENERAL & INTERNAL

Acta medica portuguesa Pub Date : 2025-03-03 Epub Date: 2024-12-20 DOI:10.20344/amp.22506

Gonçalo Ferraz-Costa, Mafalda Griné, Manuel Oliveira-Santos, Rogério Teixeira

Pages: 170-174 | Publication type: Journal Article

Citations: 0

Abstract



ChatGPT, a language model developed by OpenAI, has been tested in several medical board examinations. This study aims to evaluate the performance of ChatGPT on the Portuguese National Residency Access Examination, a mandatory test for medical residency in Portugal. The study specifically compares the capabilities of ChatGPT versions 3.5 and 4o across five examination editions from 2019 to 2023. A total of 750 multiple-choice questions were submitted to both versions, and their answers were evaluated against the official responses. The findings revealed that ChatGPT 4o significantly outperformed ChatGPT 3.5, with a median examination score of 127 compared to 106 (p = 0.048). Notably, ChatGPT 4o achieved scores within the top 1% in two examination editions and exceeded the median performance of human candidates in all editions. Additionally, ChatGPT 4o's scores were high enough to qualify for any specialty. In conclusion, ChatGPT 4o can be a valuable tool for medical education and decision-making, but human oversight remains essential to ensure safe and accurate clinical practice.


Source journal

Acta medica portuguesa (MEDICINE, GENERAL & INTERNAL)

CiteScore: 1.90

Self-citation rate: 16.70%

Articles published: 256

Review time: 6-12 weeks

About the journal: The aim of Acta Médica Portuguesa is to publish original research and review articles in biomedical areas of the highest standard, covering several domains of medical knowledge, with the purpose of helping doctors improve medical care. In order to accomplish these aims, Acta Médica Portuguesa publishes original articles, review articles, case reports and editorials, among others, with a focus on clinical, scientific, social, political and economic factors affecting health. Acta Médica Portuguesa will be happy to consider manuscripts for publication from authors anywhere in the world.