Inter-rater reliability and content validity of the measurement tool for portfolio assessments used in the Introduction to Clinical Medicine course at Ewha Womans University College of Medicine: a methodological study.
Dong-Mi Yoo, Jae Jin Han
Journal of Educational Evaluation for Health Professions, 2024;21:39. Published online 2024-12-10. DOI: https://doi.org/10.3352/jeehp.2024.21.39
Abstract
Purpose: This study aimed to examine the reliability and validity of a measurement tool for portfolio assessments in medical education. Specifically, it investigated the consistency of scoring among raters and the appropriateness of the assessment criteria as judged by an expert panel.
Methods: A cross-sectional observational study was conducted from September to December 2018 for the Introduction to Clinical Medicine course at the Ewha Womans University College of Medicine. Data were collected for 5 randomly selected portfolios scored by a gold-standard rater and 6 trained raters. An expert panel assessed the validity of 12 assessment items using the content validity index (CVI). Statistical analysis included Pearson correlation coefficients for rater alignment, the intraclass correlation coefficient (ICC) for inter-rater reliability, and the CVI for item-level validity.
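The three statistics named in the Methods can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis code: the abstract does not state which ICC form was used, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater), and it assumes the common item-level CVI definition (the share of experts rating an item 3 or 4 on a 4-point relevance scale). All scores and ratings below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` holds one row per portfolio and one column per rater.
    ASSUMPTION: the abstract does not say which ICC form was used.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: portfolios (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def item_cvi(expert_ratings, relevant_from=3):
    """Item-level CVI: proportion of experts rating the item as relevant.

    ASSUMPTION: a 4-point relevance scale where 3 and 4 count as relevant.
    """
    relevant = sum(r >= relevant_from for r in expert_ratings)
    return relevant / len(expert_ratings)


# Hypothetical data: 5 portfolios scored by a gold-standard rater and 6 raters.
gold_standard = [78, 85, 62, 90, 71]
rater_scores = [
    [80, 75, 82, 79, 70, 77],
    [88, 80, 86, 84, 79, 90],
    [60, 66, 65, 58, 72, 63],
    [92, 85, 91, 88, 80, 94],
    [70, 74, 69, 75, 68, 72],
]

for j in range(len(rater_scores[0])):
    column = [row[j] for row in rater_scores]
    print(f"Rater {j + 1} vs gold standard: r = {pearson(column, gold_standard):.4f}")

print(f"ICC(2,1) across all raters: {icc_2_1(rater_scores):.4f}")

# Hypothetical relevance ratings from 8 experts for one assessment item.
print(f"Item CVI: {item_cvi([4, 4, 3, 2, 4, 3, 4, 3]):.2f}")
```

Pearson correlation measures whether a rater ranks portfolios like the gold-standard rater, whereas an absolute-agreement ICC also penalizes raters who are systematically stricter or more lenient, so the two can diverge on the same data.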
Results: Rater 1 had the highest Pearson correlation (0.8916) with the gold-standard rater, while Rater 5 had the lowest (0.4203). The ICC for all raters was 0.3821, improving to 0.4415 after excluding Raters 1 and 5, a relative increase of 15.6%. All assessment items met the CVI threshold of ≥0.75, with some achieving a perfect score (CVI=1.0). However, items such as "sources" and "level and degree of performance" showed lower validity (CVI=0.72).
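The reported improvement in ICC reads most naturally as a relative change. A short check using only the two rounded ICC values given above yields roughly 15.5%, close to the reported 15.6%; the small gap may reflect rounding of the published coefficients, which is an assumption rather than something the abstract states.

```python
# Values as reported in the Results; variable names are illustrative.
icc_all_raters = 0.3821      # ICC with all raters
icc_after_exclusion = 0.4415 # ICC after excluding Raters 1 and 5

absolute_gain = icc_after_exclusion - icc_all_raters
relative_gain_pct = 100 * absolute_gain / icc_all_raters

print(f"Absolute gain in ICC: {absolute_gain:.4f}")
print(f"Relative gain in ICC: {relative_gain_pct:.1f}%")
```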
Conclusion: The present measurement tool for portfolio assessments demonstrated moderate reliability and strong validity, supporting its use as a credible tool. Further faculty training is needed to make portfolio assessment more reliable.
About the journal:
Journal of Educational Evaluation for Health Professions aims to provide readers with state-of-the-art practical information on educational evaluation for health professions in order to improve the quality of undergraduate, graduate, and continuing education. It specializes in educational evaluation, including the adoption of measurement theory in medical health education, the promotion of high-stakes examinations such as national licensing examinations, the improvement of nationwide or international education programs, computer-based testing, computerized adaptive testing, and medical health regulatory bodies. Its field comprises a variety of professions that address public medical health, including but not limited to: care workers, dental hygienists, dental technicians, dentists, dietitians, emergency medical technicians, health educators, medical record technicians, medical technologists, midwives, nurses, nursing aides, occupational therapists, opticians, Oriental medical doctors, Oriental medicine dispensers, Oriental pharmacists, pharmacists, physical therapists, physicians, prosthetists and orthotists, radiological technologists, rehabilitation counselors, sanitary technicians, and speech-language therapists.