Interpreting validity evidence: It is time to end the horse race

IF 4.3 | Zone 3, Psychology | Q1 PSYCHOLOGY, APPLIED

Industrial and Organizational Psychology-Perspectives on Science and Practice Pub Date : 2023-08-31 DOI:10.1017/iop.2023.27

Kevin Murphy

Journal Article | Volume 16 1 | Pages 341-343

Citations: 1

Abstract

For almost 25 years, two conclusions arising from a series of meta-analyses (summarized by Schmidt & Hunter, 1998) have been widely accepted in the field of I–O psychology: (a) that cognitive ability tests showed substantial validity as predictors of job performance, with scores on these tests accounting for over 25% of the variance in performance, and (b) that cognitive ability tests were among the best predictors of performance and, taking into account their simplicity and broad applicability, were likely to be the starting point for most selection systems. Sackett, Zhang, Berry, and Lievens (2022) challenged these conclusions, showing how unrealistic corrections for range restriction in meta-analyses had led to substantial overestimates of the validity of most tests and assessments and suggesting that cognitive tests were not among the best predictors of performance. Sackett, Zhang, Berry, and Lievens (2023) illustrate many important implications of their analysis for evaluating selection tests and/or developing selection test batteries. Discussions of the validity of alternative predictors of performance often take on the character of a horse race, in which a great deal of attention is given to determining which is the best predictor. From this perspective, one of the messages of Sackett et al. (2022) might be that cognitive ability has been dethroned as the best predictor, and that structured interviews, job knowledge tests, empirically keyed biodata forms, and work sample tests are all better choices. In my view, dethroning cognitive ability tests as the best predictor is one of the least important conclusions of the Sackett et al. (2022) review.

Although horse races might be fun, the quest to find the best single predictor of performance is arguably pointless because personnel selection is inherently a multivariate problem, not a univariate one. First, personnel selection is virtually never done based on scores on a single test or assessment. There are certainly scenarios where a low score on a single assessment might lead to a negative selection decision; an applicant for a highly selective college who submits a combined SAT score of 560 (320 in Math and 240 in Evidence-Based Reading and Writing) will almost certainly be rejected. However, real-world selection decisions that are based on any type of systematic assessment will usually be based on multiple assessments (e.g., interviews plus tests, biodata plus interviews). More to the point, the criteria that are used to evaluate the validity and value of selection tests are almost certainly multivariate. That is, although selection tests are often validated against supervisory ratings of job performance, they are not designed or used to predict these ratings, which often show uncertain relationships with actual effectiveness in the workplace (Adler et al., 2016; Murphy et al., 2018). Rather, they are used to help organizations make decisions, and assessing the quality of these decisions often requires the consideration of multiple criteria. Virtually all meta-analyses of selection test validity take a univariate perspective, usually examining the relationship between test scores and measures of job performance (as noted above, usually supervisory ratings, but sometimes objective measures or measures of training outcomes). Thus, validity is often expressed in terms of a single number (e.g., the corrected correlation

Source journal

Industrial and Organizational Psychology-Perspectives on Science and Practice (PSYCHOLOGY, APPLIED)

CiteScore

7.70

Self-citation rate

10.10%

About the journal: Industrial and Organizational Psychology-Perspectives on Science and Practice is a peer-reviewed academic journal published on behalf of the Society for Industrial and Organizational Psychology. The journal focuses on interactive exchanges on topics of importance to the science and practice of the field. It features articles that present new ideas or different takes on existing ideas, stimulating dialogue about important issues in the field. Additionally, the journal is indexed and abstracted in Clarivate Analytics SSCI, Clarivate Analytics Web of Science, European Reference Index for the Humanities and Social Sciences (ERIH PLUS), ProQuest, PsycINFO, and Scopus.