Aleksandra Ignjatović, Marija Anđelković Apostolović, Lazar Stevanović, Pavle Radovanović, Marija Topalović, Tamara Filipović, Suzana Otašević
ChatGPT's progress over time: A longitudinal enhancing biostatistical problem-solving in medical education.
Health Informatics Journal, volume 31, issue 3, article 14604582251381260. Publication date: 2025-07-01 (Epub 2025-09-19). DOI: 10.1177/14604582251381260 (https://doi.org/10.1177/14604582251381260)
Citations: 0
Abstract
Objective: ChatGPT has been recognised as a potentially transformative tool in higher education that can enhance teaching and learning. Cross-sectional evaluations have acknowledged this potential. This study evaluates ChatGPT's performance in solving specific biostatistical problems, focusing on accuracy, stability, and reproducibility, and explores its potential as a reliable educational tool in medical education. Methods: The correlation analysis task from Statistics at Square One by Swinscow and Campbell was chosen for its foundational role in biostatistics. GPT-3.5 and GPT-4 were tested for accuracy on 12 parameters at three time points: October 2023, March 2024, and July 2024. Results: Correct response rates changed significantly across the repeated measurements in October 2023, March 2024, and July 2024 for both GPT-3.5 (Q = 100.99, p < 0.001) and GPT-4.0 (Q = 89.55, p < 0.001). For GPT-3.5, significant improvement was found between March 2024 and July 2024 (p = 0.004) and between October 2023 and July 2024 (p = 0.008). For GPT-4.0, significant improvement was found between October 2023 and March 2024 (p = 0.004) and between October 2023 and July 2024 (p = 0.026). Conclusion: Over 9 months, GPT-4 demonstrated rapid and consistent improvements, achieving perfect accuracy by March 2024. Although this study documented ChatGPT's advancement within 9 months, ChatGPT should be positioned as a supplementary tool in higher education classrooms, used in the presence of educators, to enhance the learning process.
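The Q statistics reported in the Results read like Cochran's Q, a test for change in a binary outcome (correct or incorrect) across repeated measurements of the same items. The abstract does not name the test, so that reading is an assumption. The sketch below shows how such a statistic could be computed in plain Python. The function names `cochran_q` and `chi2_sf_2df` and the `responses` data are hypothetical, made up for illustration; they are not the study's code or data and do not reproduce the reported values.

```python
import math


def cochran_q(rows):
    """Return (Q, df) for rows of 0/1 outcomes.

    Each row is one matched item (e.g. one scored parameter); each column
    is one measurement occasion (e.g. October, March, July).
    """
    k = len(rows[0])
    if any(len(r) != k for r in rows):
        raise ValueError("all rows must have the same number of occasions")

    col_totals = [sum(r[j] for r in rows) for j in range(k)]  # correct per occasion
    row_totals = [sum(r) for r in rows]                       # correct per item
    n = sum(row_totals)                                       # grand total

    denom = k * n - sum(t * t for t in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every row is constant")

    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2.0)


# Hypothetical data: 12 items scored at 3 occasions (1 = correct answer).
responses = (
    [(0, 0, 1)] * 4    # correct only at the last occasion
    + [(0, 1, 1)] * 4  # correct from the second occasion on
    + [(1, 1, 1)] * 2  # always correct
    + [(0, 0, 0)] * 2  # never correct
)

q, df = cochran_q(responses)
p = chi2_sf_2df(q)  # valid here because 3 occasions give df = 2
print(f"Q = {q:.2f}, df = {df}, p = {p:.4f}")
```

With three occasions the reference distribution has 2 degrees of freedom, which is why the closed-form tail probability exp(-Q/2) suffices here; other numbers of occasions would need a general chi-square routine. Pairwise follow-up comparisons like those reported in the abstract would need a separate paired test, which this sketch does not cover.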
About the journal:
Health Informatics Journal is an international peer-reviewed journal. All papers submitted to Health Informatics Journal are subject to peer review by members of a carefully appointed editorial board. The journal operates a conventional single-blind reviewing policy in which the reviewer’s name is always concealed from the submitting author.