Jessica Hane, Vivien Lee, You Zhou, Taj Mustapha, Susan M Culican, G Nic Rider, Paul R Sackett, Michael J Cullen
Journal of Graduate Medical Education, volume 17, issue 3, pages 338-346. Published 2025-06-01. DOI: https://doi.org/10.4300/JGME-D-24-00627.1. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12168966/pdf/
Examining Gender-Based Differences in Quantitative Ratings and Narrative Comments in Faculty Assessments by Residents and Fellows.
Background: Learner assessments of faculty are widespread in medicine, yet concerns are growing about possible biases in these assessments and their associations with gender disparities.

Objective: To investigate gender-based differences in how residents and fellows describe faculty (rater effect) and how faculty are described (ratee effect) in faculty assessments, and their associations with teaching effectiveness ratings.

Methods: We analyzed 2164 trainee assessments of University of Minnesota Medical School faculty from 2019 to 2023 with trainee and faculty gender information and narrative comments. Using natural language processing, we categorized words and 2-word groups (n-grams) into communal (eg, caring, kind), standout (eg, outstanding, amazing), and agentic/ability (eg, assertive, controlling) groups. We examined gender-based differences in n-grams used by trainees (rater effect) and received by faculty (ratee effect), and relationships between n-gram frequency and teaching effectiveness ratings.

Results: Women trainees used more communal (rater effect, incidence rate ratio [IRR]=1.36; 95% CI, 1.27-1.47), standout (IRR=1.20; 95% CI, 1.08-1.34), and agentic/ability words (IRR=1.37; 95% CI, 1.26-1.49; P<.001) than men trainees. Women faculty received fewer agentic/ability words than men faculty (ratee effect, IRR=0.83; 95% CI, 0.77-0.90; P<.001). Women trainees used fewer communal words when describing women faculty (interaction effect, IRR=0.84; 95% CI, 0.73-0.98; P<.05). Teaching effectiveness ratings correlated with faculty n-gram word frequency in standout (men: r_s=0.29; women: r_s=0.28; P<.001) and communal categories (men: r_s=0.23, P=.003; women: r_s=0.22, P=.01).

Conclusions: Women trainees used more communal, standout, and agentic/ability descriptors, while women faculty had fewer agentic/ability descriptors. Women trainees used fewer communal words when describing women faculty. Standout and communal word frequency predicted teaching effectiveness ratings for both genders.
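
The n-gram categorization described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the tiny lexicons below contain only the example words quoted in the abstract, the function names (`ngrams`, `category_counts`, `crude_rate_ratio`) are hypothetical, and the crude rate ratio is only an unadjusted stand-in for the model-based IRRs reported in the Results, whose regression model the abstract does not specify.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons: only the example words quoted in the abstract.
# The study's full word lists are not given there.
LEXICON = {
    "communal": {"caring", "kind"},
    "standout": {"outstanding", "amazing"},
    "agentic_ability": {"assertive", "controlling"},
}


def ngrams(text):
    """Return lowercase single words plus adjacent 2-word groups."""
    tokens = re.findall(r"[a-z']+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def category_counts(text):
    """Count how many n-grams in a comment fall into each lexicon category."""
    counts = Counter({category: 0 for category in LEXICON})
    for gram in ngrams(text):
        for category, terms in LEXICON.items():
            if gram in terms:
                counts[category] += 1
    return counts


def crude_rate_ratio(count_a, n_a, count_b, n_b):
    """Unadjusted ratio of per-comment word rates between group A and group B.

    Assumption: a simple descriptive analogue of an incidence rate ratio,
    not the regression estimate used in the study.
    """
    return (count_a / n_a) / (count_b / n_b)


comment = "Dr. X is caring, kind, and an outstanding teacher."
counts = category_counts(comment)
print(dict(counts))

# Toy numbers, invented for illustration only (not study data):
# 30 communal words in 100 comments versus 20 communal words in 100 comments.
print(round(crude_rate_ratio(30, 100, 20, 100), 2))
```

Read this way, a ratio above 1 means the first group uses a category more often per comment than the second, so the reported IRR of 1.36 corresponds to roughly 36% more communal words from women trainees than from men trainees.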
About the journal:
- Be the leading peer-reviewed journal in graduate medical education;
- Promote scholarship and enhance the quality of research in the field;
- Disseminate evidence-based approaches for teaching, assessment, and improving the learning environment; and
- Generate new knowledge that enhances graduates' ability to provide high-quality, cost-effective care.