Using ChatGPT in Psychiatry to Design Script Concordance Tests in Undergraduate Medical Education: Mixed Methods Study

IF 3.2 Q1 EDUCATION, SCIENTIFIC DISCIPLINES

JMIR Medical Education Pub Date : 2024-04-04 DOI:10.2196/54067

Alexandre Hudon, Barnabé Kiepura, Myriam Pelletier, Véronique Phan

Citations: 0

Abstract

Background: Undergraduate medical studies offer a wide range of learning opportunities, delivered through various teaching-learning modalities. A clinical scenario is frequently used as a modality, followed by multiple-choice and open-ended questions, among other learning and teaching methods. Script concordance tests (SCTs) build on such scenarios and can be used to promote a higher level of clinical reasoning. Recent technological developments have made generative artificial intelligence (AI)-based systems such as ChatGPT (OpenAI) available to assist clinician-educators in creating instructional materials.

Objective: The main objective of this project is to explore how SCTs generated by ChatGPT compare with SCTs produced by clinical experts on 3 major elements: the scenario (stem), clinical questions, and expert opinion.

Methods: This mixed methods study compared 3 ChatGPT-generated SCTs with 3 expert-created SCTs using a predefined framework. Clinician-educators and resident doctors in psychiatry involved in undergraduate medical education in Quebec, Canada, evaluated the 6 SCTs via a web-based survey on 3 criteria: the scenario, clinical questions, and expert opinion. They were also asked to describe the strengths and weaknesses of the SCTs.

Results: A total of 102 respondents assessed the SCTs. There were no significant differences between the 2 types of SCTs concerning the scenario (P=.84), clinical questions (P=.99), and expert opinion (P=.07), as interpreted by the respondents. Indeed, respondents struggled to differentiate between ChatGPT- and expert-generated SCTs. ChatGPT showed promise in expediting SCT design and aligned well with Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition criteria, albeit with a tendency toward caricatured scenarios and simplistic content.

Conclusions: This study is the first to concentrate on the design of SCTs supported by AI, in a period when medicine is changing swiftly and AI-based technologies are expanding even faster. This study suggests that ChatGPT can be a valuable tool in creating educational materials, and that further validation is essential to ensure educational efficacy and accuracy.

Source journal

JMIR Medical Education Social Sciences-Education

CiteScore

6.90

Self-citation rate

5.60%

Articles published

Review time

8 weeks