Jakub Kufel, Iga Paszkiewicz, Michał Bielówka, Wiktoria Bartnikowska, Michał Janik, Magdalena Stencel, Łukasz Czogalik, Katarzyna Gruszczyńska, Sylwia Mielcarska
{"title":"ChatGPT会通过波兰放射学和诊断成像专业考试吗?洞察优势和局限性。","authors":"Jakub Kufel, Iga Paszkiewicz, Michał Bielówka, Wiktoria Bartnikowska, Michał Janik, Magdalena Stencel, Łukasz Czogalik, Katarzyna Gruszczyńska, Sylwia Mielcarska","doi":"10.5114/pjr.2023.131215","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Rapid development of artificial intelligence has aroused curiosity regarding its potential applications in medical field. The purpose of this article was to present the performance of ChatGPT, a state-of-the-art language model in relation to pass rate of national specialty examination (PES) in radiology and imaging diagnostics within Polish education system. Additionally, the study aimed to identify the strengths and limitations of the model through a detailed analysis of issues raised by exam questions.</p><p><strong>Material and methods: </strong>The present study utilized a PES exam consisting of 120 questions, provided by Medical Exami-nations Center in Lodz. Questions were administered using openai.com platform that grants free access to GPT-3.5 model. All questions were categorized according to Bloom's taxonomy to assess their complexity and difficulty. Following the answer to each exam question, ChatGPT was asked to rate its confidence on a scale of 1 to 5 to evaluate the accuracy of its response.</p><p><strong>Results: </strong>ChatGPT did not reach the pass rate threshold of PES exam (52%); however, it was close in certain question categories. No significant differences were observed in the percentage of correct answers across question types and sub-types.</p><p><strong>Conclusions: </strong>The performance of the ChatGPT model in the pass rate of PES exam in radiology and imaging diagnostics in Poland is yet to be determined, which requires further research on improved versions of ChatGPT.</p>","PeriodicalId":94174,"journal":{"name":"Polish journal of radiology","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/e4/61/PJR-88-51387.PMC10551734.pdf","citationCount":"0","resultStr":"{\"title\":\"Will ChatGPT pass the Polish specialty exam in radiology and diagnostic imaging? Insights into strengths and limitations.\",\"authors\":\"Jakub Kufel, Iga Paszkiewicz, Michał Bielówka, Wiktoria Bartnikowska, Michał Janik, Magdalena Stencel, Łukasz Czogalik, Katarzyna Gruszczyńska, Sylwia Mielcarska\",\"doi\":\"10.5114/pjr.2023.131215\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>Rapid development of artificial intelligence has aroused curiosity regarding its potential applications in medical field. The purpose of this article was to present the performance of ChatGPT, a state-of-the-art language model in relation to pass rate of national specialty examination (PES) in radiology and imaging diagnostics within Polish education system. Additionally, the study aimed to identify the strengths and limitations of the model through a detailed analysis of issues raised by exam questions.</p><p><strong>Material and methods: </strong>The present study utilized a PES exam consisting of 120 questions, provided by Medical Exami-nations Center in Lodz. Questions were administered using openai.com platform that grants free access to GPT-3.5 model. All questions were categorized according to Bloom's taxonomy to assess their complexity and difficulty. 
Following the answer to each exam question, ChatGPT was asked to rate its confidence on a scale of 1 to 5 to evaluate the accuracy of its response.</p><p><strong>Results: </strong>ChatGPT did not reach the pass rate threshold of PES exam (52%); however, it was close in certain question categories. No significant differences were observed in the percentage of correct answers across question types and sub-types.</p><p><strong>Conclusions: </strong>The performance of the ChatGPT model in the pass rate of PES exam in radiology and imaging diagnostics in Poland is yet to be determined, which requires further research on improved versions of ChatGPT.</p>\",\"PeriodicalId\":94174,\"journal\":{\"name\":\"Polish journal of radiology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/e4/61/PJR-88-51387.PMC10551734.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Polish journal of radiology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5114/pjr.2023.131215\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Polish journal of radiology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5114/pjr.2023.131215","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/1/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
Will ChatGPT pass the Polish specialty exam in radiology and diagnostic imaging? Insights into strengths and limitations.
Purpose: The rapid development of artificial intelligence has aroused curiosity about its potential applications in the medical field. The purpose of this article was to present the performance of ChatGPT, a state-of-the-art language model, on the national specialty examination (PES) in radiology and diagnostic imaging within the Polish education system, relative to the exam's pass threshold. Additionally, the study aimed to identify the strengths and limitations of the model through a detailed analysis of the issues raised by the exam questions.
Material and methods: The present study used a PES exam consisting of 120 questions, provided by the Medical Examinations Center in Lodz. Questions were administered through the openai.com platform, which grants free access to the GPT-3.5 model. All questions were categorized according to Bloom's taxonomy to assess their complexity and difficulty. After answering each exam question, ChatGPT was asked to rate its confidence on a scale of 1 to 5, so that its self-reported confidence could be compared with the accuracy of its responses.
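For readers who want to reproduce a similar question-and-confidence protocol programmatically, the following is a minimal sketch using the OpenAI Python client. Note that the study itself used the openai.com chat interface rather than the API; the model name "gpt-3.5-turbo", the prompt wording, and the example question are assumptions made for illustration only.

```python
# Illustrative sketch only: the study used the web chat interface, not the API.
# Model name, prompt wording, and the sample question are assumptions.
from openai import OpenAI  # requires the `openai` package (v1.x)

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_with_confidence(question: str, choices: list[str]) -> str:
    """Pose one multiple-choice question and request a 1-5 confidence rating."""
    prompt = (
        question
        + "\n"
        + "\n".join(f"{letter}. {text}" for letter, text in zip("ABCDE", choices))
        + "\nAnswer with a single letter, then rate your confidence "
          "from 1 (pure guess) to 5 (certain)."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output makes grading reproducible
    )
    return response.choices[0].message.content


# Hypothetical usage with a made-up exam-style question:
print(ask_with_confidence(
    "Which imaging modality is typically first-line for suspected acute "
    "appendicitis in adults?",
    ["Ultrasound", "CT", "MRI", "Plain radiograph"],
))
```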
Results: ChatGPT did not reach the pass threshold of the PES exam (52%); however, it came close in certain question categories. No significant differences were observed in the percentage of correct answers across question types and subtypes.
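A comparison of correct-answer proportions across question categories is commonly done with a chi-square test of independence; the sketch below shows what such a check could look like. The counts are invented for illustration and are not the study's data.

```python
# Illustrative only: chi-square test of independence on made-up counts,
# not the study's actual results.
from scipy.stats import chi2_contingency

# Rows: hypothetical question categories; columns: [correct, incorrect]
counts = [
    [20, 18],
    [15, 17],
    [18, 32],
]
chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")  # p >= 0.05 -> no significant difference
```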
Conclusions: Whether the ChatGPT model can reach the pass threshold of the PES exam in radiology and diagnostic imaging in Poland remains to be determined, which calls for further research on improved versions of ChatGPT.