Validity evidence for personality scores from algorithms trained on low-stakes verbal data and applied to high-stakes interviews

Impact factor 2.6 · CAS Zone 4 (Management) · JCR Q3 (Management)
Brent A. Stevenor, Louis Hickman, Michael J. Zickar, Fletcher Wimbush, Weston Beck
Journal: International Journal of Selection and Assessment, volume 32, issue 4, pages 544-560
Publication date: 2024-05-31
Publication type: Journal Article
DOI: 10.1111/ijsa.12480
URL: https://onlinelibrary.wiley.com/doi/10.1111/ijsa.12480
Citation count: 0

Abstract

We present multifaceted validity evidence for machine learning models, referred to in this research as automated video interview personality assessments (AVI-PAs), that were trained on verbal data and interviewer ratings from low-stakes interviews and applied to high-stakes interviews to infer applicant personality. The predictive models used RoBERTa embeddings and binary unigrams as predictors. In Study 1 (N = 107), AVI-PAs reflected interviewer ratings more closely than they reflected applicant and reference ratings. AVI-PAs and interviewer ratings also had similar relations with applicants' interview behaviors, biographical information, and hireability. In Study 2 (N = 25), AVI-PAs had weak-to-moderate (nonsignificant) relations with subsequent supervisor ratings of job performance. Empirically, the AVI-PAs were most similar to interviewer ratings. AVI-PAs, interviewer ratings, self-reports, and reference-reports all demonstrated weak discriminant validity evidence. LASSO regression provided superior (but still weak) discriminant evidence compared to elastic net regression. Despite the use of natural language embeddings to operationalize verbal behavior, the AVI-PAs (except emotional stability) exhibited large correlations with interviewee word count. We discuss the implications of these findings for pre-employment personality assessments and effective AVI-PA design.
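
To make two terms from the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It shows what "binary unigrams" could look like as predictors (each word is coded 1 if it occurs in a transcript and 0 otherwise, regardless of how often) and how a correlation between model scores and interviewee word count could be checked. The function names, the toy transcripts, and the scores are all hypothetical and invented for illustration; the RoBERTa embeddings and the LASSO and elastic net models mentioned in the abstract are not reproduced here.

```python
import math
import re


def tokenize(text):
    """Lowercase a transcript and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def binary_unigrams(transcripts):
    """Return (vocabulary, rows); rows[i][j] is 1 if word j occurs in transcript i."""
    token_sets = [set(tokenize(t)) for t in transcripts]
    vocabulary = sorted(set().union(*token_sets))
    rows = [[1 if word in tokens else 0 for word in vocabulary]
            for tokens in token_sets]
    return vocabulary, rows


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy transcripts and model scores (assumptions, not study data).
transcripts = [
    "I enjoy working with people and I enjoy leading teams",
    "I prefer quiet work",
    "I like planning projects and meeting new people every week at work",
]
scores = [3.9, 2.1, 4.4]

vocabulary, rows = binary_unigrams(transcripts)

# Word count per transcript, the variable the abstract reports as
# strongly correlated with most AVI-PA scores.
word_counts = [len(tokenize(t)) for t in transcripts]
r = pearson_r(scores, word_counts)
```

In this toy data, "enjoy" appears twice in the first transcript but is still coded as 1, which is the sense in which the features are binary. A check like `pearson_r(scores, word_counts)` is one simple way to see whether scores track how much an interviewee talks rather than what they say.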

Source journal: International Journal of Selection and Assessment
CiteScore: 4.10
Self-citation rate: 31.80%
Articles published: 46
Journal description: The International Journal of Selection and Assessment publishes original articles related to all aspects of personnel selection, staffing, and assessment in organizations. Using an effective combination of academic research with professional-led best practice, IJSA aims to develop new knowledge and understanding in these important areas of work psychology and contemporary workforce management.