Artificial Intelligence Meets Item Analysis (AI meets IA): A Study of Chatbot Training and Performance in detecting and correcting MCQ Flaws.

Impact factor 1.2 | Zone 4, Medicine | JCR Q2: Medicine, General & Internal
Mashaal Sabqat, Rehan Ahmed Khan, Masood Jawaid, Madiha Sajjad
Pakistan Journal of Medical Sciences, vol. 41, no. 3, pp. 652-656. Published 2025-03-01. Publication type: Journal Article.
DOI: https://doi.org/10.12669/pjms.41.3.11224
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11911725/pdf/

Abstract

Objective: To explore the potential of AI-powered chatbots, specifically ChatGPT, to identify and correct flaws in multiple-choice questions (MCQs).

Methods: A three-phase interventional study was conducted from February to August 2023 at Riphah International University, Islamabad. In Phase 1, flawed MCQs were selected from the National Board of Medical Examiners (NBME) item-writing guide and submitted to ChatGPT, which identified item flaws and suggested corrections. In Phase 2, ChatGPT was trained to detect flaws in MCQs using text from the NBME item-writing guide. In Phase 3, ChatGPT was tested again on detecting flaws and correcting MCQs. Data were analyzed using SPSS version 26 and reported as percentages, with McNemar's test (exact conditional method) used for significance testing.

Results: ChatGPT could identify and correct flaws such as the use of "None of the above," "grammatical cues," "absolute terms," and "inconsistently presented numerical data." However, it struggled with flaws related to "complicated stems," "long or complex options," and "vague frequency terms." After training, ChatGPT became better at identifying and correcting flaws related to complicated stems and absolute terms. Both before and after training, it had difficulty recognizing "nonparallel options," "convergence," and "word repetition." ChatGPT's performance deteriorated during peak hours. The test of significance showed no statistically significant improvement in ChatGPT's detection of item flaws (p = 1.00) or in its correction of them (p = 0.125).
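
For readers unfamiliar with the exact form of McNemar's test named in the Methods, the following is a minimal sketch of how a two-sided exact p-value is computed from paired before/after outcomes. It is not the authors' SPSS procedure. The function name `mcnemar_exact` and all counts are hypothetical and chosen only for illustration; the abstract does not report the underlying paired counts.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: items handled correctly before training but not after (hypothetical count)
    c: items handled correctly after training but not before (hypothetical count)
    Only these discordant pairs enter the test; under the null hypothesis
    each discordant pair is equally likely to fall in either direction.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    # Binomial(n, 0.5) probability of k or fewer pairs in the smaller cell
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Illustrative counts only, not data from the study.
print(mcnemar_exact(2, 2))  # balanced discordant pairs
print(mcnemar_exact(0, 4))  # four discordant pairs, all in one direction
```

With these illustrative inputs, balanced discordant pairs give p = 1.0, and four discordant pairs all in one direction give p = 0.125. This shows how a small number of discordant items limits the smallest p-value the exact test can reach.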

Conclusion: AI is revolutionizing industries and improving efficiency, but limitations exist in complex conversations, analysis, accuracy, and error prevention. Ongoing research is vital to unlocking AI's potential, especially in education.

Source journal: Pakistan Journal of Medical Sciences (category: Medicine, General & Internal)
CiteScore: 4.10
Self-citation rate: 9.10%
Articles published: 363
Review time: 3-6 weeks
About the journal: Pakistan Journal of Medical Sciences (PJMS) is a peer-reviewed medical journal published regularly since 1984. Until December 31, 1999, it was known as the quarterly "SPECIALIST". It publishes original research articles, review articles, current practices, short communications, and case reports. It attracts manuscripts not only from within Pakistan but also from more than fifty other countries. Copies of PJMS are sent to all the important medical libraries in Pakistan and overseas, particularly in South East Asia and Asia Pacific, as well as in WHO EMRO region countries. Eminent members of the medical profession at home and abroad regularly contribute manuscripts to its publications. The journal pursues an independent editorial policy that gives healthcare professionals an opportunity to express their views without fear or favour, which is why many opinion makers in the medical and pharmaceutical professions use it to communicate their viewpoints.