ChatGPT's progress over time: A longitudinal enhancing biostatistical problem-solving in medical education.

IF 2.3 | Zone 3 (Medicine) | JCR Q2 | HEALTH CARE SCIENCES & SERVICES

Health Informatics Journal Pub Date : 2025-07-01 Epub Date: 2025-09-19 DOI:10.1177/14604582251381260

Aleksandra Ignjatović, Marija Anđelković Apostolović, Lazar Stevanović, Pavle Radovanović, Marija Topalović, Tamara Filipović, Suzana Otašević

Health Informatics Journal, volume 31, issue 3, article 14604582251381260. Publication type: Journal Article.

Citations: 0

Abstract

Objective: ChatGPT has been recognised as a potentially transformative tool in higher education for its capacity to enhance teaching and learning, a potential that cross-sectional evaluations have acknowledged. This study evaluates ChatGPT's performance in solving specific biostatistical problems, focusing on accuracy, stability, and reproducibility, and explores its potential as a reliable educational tool in medical education.

Methods: The correlation analysis task from Statistics at Square One by Swinscow and Campbell was chosen for its foundational role in biostatistics. GPT-3.5 and GPT-4 were tested for accuracy on 12 parameters at three time points: October 2023, March 2024, and July 2024.

Results: Correct response rates changed significantly across the repeated measurements in October 2023, March 2024, and July 2024 for both GPT-3.5 (Q = 100.99, p < 0.001) and GPT-4.0 (Q = 89.55, p < 0.001). GPT-3.5 improved significantly between March 2024 and July 2024 (p = 0.004) and between October 2023 and July 2024 (p = 0.008). GPT-4.0 improved significantly between October 2023 and March 2024 (p = 0.004) and between October 2023 and July 2024 (p = 0.026).

Conclusion: Over 9 months, GPT-4 demonstrated rapid and consistent improvements, achieving perfect accuracy by March 2024. Although this study documents ChatGPT's advancement within 9 months, ChatGPT should be positioned as a supplementary tool in higher education classrooms, used in the presence of educators, to enhance the learning process.
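The Results report a Q statistic for repeated correct/incorrect outcomes but the abstract does not name the test. The following is a minimal sketch, assuming the statistic is Cochran's Q (a common choice for related binary samples). The `cochran_q` helper and the `responses` table are hypothetical illustrations and are not the study's code or data.

```python
import math


def cochran_q(table):
    """Cochran's Q for rows of 0/1 outcomes, one column per time point.

    Returns (Q, degrees of freedom, p-value). The p-value is computed only
    for df == 2, where the chi-square survival function is exp(-Q / 2);
    for other df it is returned as None.
    """
    k = len(table[0])
    if any(len(row) != k for row in table):
        raise ValueError("every row needs one outcome per time point")
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every row is constant")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    df = k - 1
    p = math.exp(-q / 2) if df == 2 else None
    return q, df, p


# Hypothetical data: 6 graded items (rows) scored 1 = correct, 0 = incorrect
# at three time points (columns). These values are invented for illustration.
responses = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 1],
    [0, 1, 1],
]

q, df, p = cochran_q(responses)
print(f"Q = {q:.2f}, df = {df}, p = {p:.3f}")  # Q = 7.60, df = 2, p = 0.022
```

With three time points the test has 2 degrees of freedom, which is why the p-value has a closed form here; a general implementation would use a chi-square survival function from a statistics library instead.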




Source journal

Health Informatics Journal (HEALTH CARE SCIENCES & SERVICES; MEDICAL INFORMATICS)

CiteScore: 7.80

Self-citation rate: 6.70%

Review time: 6 months

Journal description: Health Informatics Journal is an international peer-reviewed journal. All papers submitted to the journal are reviewed by members of a carefully appointed editorial board. The journal operates a conventional single-blind reviewing policy in which the reviewer's name is always concealed from the submitting author.