Not All EPAs Are Created Equal: Fixing Sampling Bias With Utility Modeling.

Impact Factor (IF): 2.1

Journal of Surgical Education | Pub Date: 2025-09-25 | DOI: 10.1016/j.jsurg.2025.103708

Phillip Jenkins, Ali Oran, Carolyn C Chang, Jonathan Jesneck, Julie Doberne, Ruchi Thanawala


Citations: 0

Abstract


Not All EPAs Are Created Equal: Fixing Sampling Bias With Utility Modeling.

Background: Entrustable professional activities (EPAs) are foundational for understanding resident progress towards practice readiness. Unfortunately, when EPA assessments are initiated manually, their completion has been uneven, and the resulting variability across individuals, specialties, and institutions introduces bias. Therefore, we introduce EPA assessment utility modeling, which can retrospectively correct for and prospectively avoid these biases by informing each attending of the usefulness of each EPA assessment opportunity and highlighting when EPA assessments are most needed.

Methods: We performed a longitudinal analysis of general surgery EPA assessments using an electronic health record (EHR)-integrable medical-education platform across 37 institutions. EPA assessment counts were fitted with power law curves to measure skewing. Raw EPA assessment ratings, combined with historical case logs and operating room (OR) schedules, were analyzed with the platform's large-scale Bayesian network model to quantify each EPA assessment's impact on entrustment learning curves. Lastly, we used Monte Carlo simulations to develop an assessment utility score, an intuitive label for the predicted benefit of each EPA assessment opportunity, in order to prompt faculty members to complete the most useful assessments.
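The power law fit mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not say how the curves were fitted, so the log-log least-squares approach below is an assumption, as are the function name `fit_power_law` and the synthetic counts; it is not the authors' procedure.

```python
import numpy as np

def fit_power_law(counts):
    """Fit count ~ scale * rank**(-alpha) by least squares in log-log space.

    Assumption: ranked-count fitting via linear regression on logs; the
    source does not specify the fitting method actually used.
    """
    ordered = np.sort(np.asarray(counts, dtype=float))[::-1]  # largest first
    ordered = ordered[ordered > 0]                            # log needs positive counts
    ranks = np.arange(1, ordered.size + 1)
    slope, intercept = np.polyfit(np.log(ranks), np.log(ordered), 1)
    return float(-slope), float(np.exp(intercept))

# Synthetic (hypothetical) counts that follow an exact power law with alpha = 0.5.
ranks = np.arange(1, 19)
synthetic_counts = 1000.0 * ranks ** -0.5
alpha, scale = fit_power_law(synthetic_counts)
print(f"alpha = {alpha:.3f}, scale = {scale:.1f}")
```

A larger fitted alpha means counts fall off faster with rank, i.e. assessments are concentrated in fewer categories.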

Results: From 6/2023 to 5/2025, 444 faculty assessed 532 residents with 17,245 EPA assessments. EPA assessment counts showed substantial skewing across several factors. By EPA type, 52.8% of EPA assessments were of the top 4 (22.2%) types (power law α = 0.27, 2 p ≈ 0). By faculty, 33.5% of EPA assessments were from the most active 15 (4.3%) faculty members (α = 0.15, 2 p ≈ 0). By faculty specialty, 31.0% were from the most active 2 (9.5%) specialties (α = 0.24, 2 p ≈ 0). By resident, 20.1% were received by the 20 (4.5%) most assessed residents (α = 0.21, 2 p ≈ 0).
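The concentration figures above each pair a share of assessments with a share of categories (for example, 52.8% of assessments from 22.2% of EPA types). A minimal sketch of that calculation follows; the function name `top_k_share` and the counts are hypothetical, and the choice of 18 categories is an assumption made only so that 4 of them correspond to 22.2%, since the abstract does not state the totals.

```python
def top_k_share(counts, k):
    """Return (share of all assessments held by the k largest categories,
    fraction of categories that k represents)."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    if total <= 0 or not 0 < k <= len(ordered):
        raise ValueError("need positive total and 0 < k <= number of categories")
    return sum(ordered[:k]) / total, k / len(ordered)

# Hypothetical per-category assessment counts (not data from the study).
example_counts = [400, 300, 200, 100] + [50] * 14
share, fraction = top_k_share(example_counts, 4)
print(f"top 4 categories ({fraction:.1%} of categories) hold {share:.1%} of assessments")
```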

Conclusion: EPA assessments were heavily skewed by sampling biases, misrepresenting entrustment levels. To fix these biases and provide a data-driven approach to competency-based education (CBE) measurement, we propose an assessment utility framework to optimize EPA assessment timing, assessor, and prioritization.
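To give a feel for what an assessment utility score could look like, here is a toy sketch. It is not the platform's Bayesian network model, which the abstract does not describe in detail. As a loudly labeled stand-in, it assumes a simple Beta-Bernoulli model of whether a resident is rated entrustable on one EPA, and defines utility as the Monte Carlo estimate of the expected drop in posterior variance from one more assessment. All names (`beta_variance`, `assessment_utility`) and numbers are hypothetical.

```python
import random

def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

def assessment_utility(successes, failures, n_sims=20000, seed=0):
    """Monte Carlo estimate of the expected reduction in posterior variance
    from one additional binary assessment.

    Assumptions (illustrative only): Beta(1, 1) prior, binary ratings,
    and variance reduction as the notion of "usefulness".
    """
    rng = random.Random(seed)
    a, b = 1 + successes, 1 + failures
    before = beta_variance(a, b)
    total_after = 0.0
    for _ in range(n_sims):
        p = rng.betavariate(a, b)      # draw a plausible entrustment rate
        outcome = rng.random() < p     # simulate the next rating
        total_after += beta_variance(a + outcome, b + (not outcome))
    return before - total_after / n_sims

sparse = assessment_utility(0, 0)      # resident with no prior assessments
dense = assessment_utility(30, 10)     # resident already assessed 40 times
print(f"utility (no history): {sparse:.5f}")
print(f"utility (40 prior assessments): {dense:.5f}")
```

Under these assumptions, an assessment of a rarely assessed resident scores far higher than one more assessment of a heavily assessed resident, which is the kind of prioritization signal the proposed framework aims to provide.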


Source journal

Journal of surgical education

Self-citation rate

0.00%

Number of publications