Comparison Between Serial and Independent Questions: A Psychometric and Methodological Approach.

Impact factor: 1.6 (JCR Q2, Education, Scientific Disciplines)

Journal of Medical Education and Curricular Development. Pub Date: 2025-07-15. eCollection Date: 2025-01-01. DOI: 10.1177/23821205251359701

Víctor Hugo Olmedo Canchola, José Gamaliel Velazco González, Gustavo Quiroga Martínez



Abstract

Objective: To determine if statistical and psychometric outcomes differ between tests composed of serial and independent questions. Specific goals include assessing which format provides better reliability and validity, understanding response patterns, and comparing difficulty and discrimination indices under classical test theory.
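As an illustration of the classical test theory indices named in the objective, the sketch below computes item difficulty (proportion correct) and item discrimination (corrected item-total correlation) for dichotomously scored items. The function names and the toy response matrix are hypothetical and are not taken from the study; the authors' actual computation may differ.

```python
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly (0 to 1)."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """Corrected item-total correlation: the item score correlated with
    the total score on the remaining items (point-biserial for 0/1 items)."""
    x = [row[item] for row in responses]
    y = [sum(row) - row[item] for row in responses]
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        # Undefined when the item or the rest score has no variance.
        return float("nan")
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


# Hypothetical toy data: 6 examinees (rows) x 4 items (columns), 1 = correct.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

for j in range(len(scores[0])):
    p = item_difficulty(scores, j)
    d = item_discrimination(scores, j)
    print(f"item {j}: difficulty={p:.3f}, discrimination={d:.3f}")
```

A higher difficulty value means an easier item, and a discrimination value near zero or below suggests the item does not separate stronger from weaker examinees.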

Methodology: The study involved a single-group design with spiral counterbalance, allowing examinees to answer both formats within a single exam of 220 items. Of these, 200 were independent questions, and 20 were organized into 4 clinical cases with 5 related items each. The exam was administered by computer to anesthesiologists undergoing certification or recertification.

Results: Among 2109 candidates, internal consistency differed significantly between formats: Cronbach's alpha was .790 for independent questions and .527 for serial questions. Scores in the 2 formats showed a moderate positive correlation (r = .488). Difficulty and discrimination indices did not differ significantly between formats.
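For readers unfamiliar with the internal-consistency statistic reported above, here is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and toy data are hypothetical and do not reproduce the study's data or its .790 and .527 values.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a matrix of item scores.

    responses: list of rows, one per examinee, each a list of k item scores.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across examinees.
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    # Variance of the examinees' total scores.
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 6 examinees x 4 dichotomous items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(f"alpha = {cronbach_alpha(toy):.3f}")
```

In a comparison like the one reported here, the same function would be applied separately to the columns for each format.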

Discussion: Independent questions showed higher reliability, likely due to their lack of dependency, making them more suitable for high-stakes exams. Serial questions, while valuable for assessing integrative reasoning, introduce dependency that affects consistency and may skew outcomes when the initial question is answered incorrectly. Despite similar difficulty and discrimination indices, the unique dependency in serial questions affects their suitability for high-stakes testing.

Conclusions: Independent questions provide a more reliable format for high-stakes exams, but serial questions can enhance assessments by probing various aspects of clinical reasoning within a single case. A balanced approach incorporating both formats may optimize the reliability and validity of medical certification exams, leveraging the strengths of each question type.



Source journal: Journal of Medical Education and Curricular Development (Education, Scientific Disciplines)

Self-citation rate: 0.00%

Review time: 8 weeks