{"title":"Selection in the Operational Domain Requires More Than Grades: High School Marks do not Identify High-Flyers","authors":"Emil Lager, Kimmo Sorjonen, Marika Melin","doi":"10.1111/ijsa.70026","DOIUrl":"https://doi.org/10.1111/ijsa.70026","url":null,"abstract":"<p>As pilot selection comes under increasing scrutiny and pilot training becomes more theory-intensive, there is growing interest in whether academic metrics—such as high school grades—can predict training outcomes. While high school grades are predictive of academic success in traditional higher education, their predictive value in operational contexts remains underexplored. This study tested whether high school grades predict selection and training success in a professional pilot program. Data from 2111 applicants to LUSA in Sweden (2009–2019) were analyzed. Grades in Swedish, English, and Mathematics were examined as predictors—separately and as a composite score—of three outcomes: (a) admission, (b), theoretical course performance, and (c) graduation from flight training. Of the 2111 applicants, 169 (8%) were accepted and 147 (87%) successfully graduated. Results showed high school grades to have limited to no predictive utility across all outcomes. Only a few weak correlations emerged, but none remained significant after Bonferroni correction, and corrections for range restriction did not alter the findings. Our null findings are indicative of the competencies essential for pilot training not being well captured by academic grades once eligibility criteria are met. This reinforces the need for domain-specific and operationally relevant assessment tools in pilot selection.</p>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 4","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70026","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145146853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Validity of Single-Response Situational Judgment Tests: A Nomological Network Meta-Analysis","authors":"Michelle P. Martín-Raugh, Emily A. Gallegos, Katrisha M. Smith, Ricardo R. Brooks, Harrison J. Kell","doi":"10.1111/ijsa.70025","DOIUrl":"https://doi.org/10.1111/ijsa.70025","url":null,"abstract":"<p>Nearly 15 years after the first empirical validation of the then-novel single-response situational judgment test (SJT) methodology, research using single-response SJTs has proliferated. Single-response SJTs simply feature one edited critical incident that is evaluated by respondents–hence, the term “single-response” SJT. Single-response SJT items bypass the need for experts to generate and evaluate response options, simplifying and reducing the cost of test construction. We report the first meta-analysis of the criterion-related validity of single-response SJTs and explore the nomological network surrounding the procedural knowledge measured by this format. Results from a random-effects meta-analysis (<i>k</i> = 20, <i>N</i> = 3685) demonstrate that associations between antecedents of single-response SJT scores and criteria mirrored those in the multiple-response SJT literature, with positive associations in all cases. The reliability estimates for single-response SJTs ranged from <i>⍺</i> = 0.37 to <i>⍺</i> = 0.93, with an average of <i>⍺</i> = 0.82. The 95% confidence interval for the uncorrected correlation for single-response SJTs (95% CI [0.12, 28]) encompasses the validity correlations for multiple-response SJTs reported by McDaniel et al. (2007) (0.20, 0.26). We found that single-response SJTs correlated 0.18 (uncorrected) and 0.20 (corrected) with job performance. Additionally, we meta-analyze the correlations between single-response SJTs scores, personality and emotional intelligence, and also explore their criterion-related validity. Despite the nascency of this study area and that most studies were conducted in low-stakes lab settings, findings suggest that overall, single-response SJTs may be promising personnel selection tools.</p>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 4","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70025","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145110735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining the Effect of Social Desirability on the Relationships Between Personality Traits and Safety Performance","authors":"Jidong Xue, Lu Zheng, Jingyi Li, Lina Wang, Kaiguang Liang, Carl, Jinyan Fan","doi":"10.1111/ijsa.70024","DOIUrl":"https://doi.org/10.1111/ijsa.70024","url":null,"abstract":"<div>\u0000 \u0000 <p>Personality inventories can be used to identify safety-prone employees when making selection and job placement decisions. Self-report personality scores, however, are susceptible to social desirability. For personality inventories to be used with confidence in selecting and placing employees in safety-related positions, it is necessary to understand how social desirability influences the criterion-related validity of personality scores. In response, the current study examined whether two types of social desirability, self-deceptive enhancement (SDE) and impression management (IM), suppress or moderate the relationship between personality scores and safety performance. Participants in this concurrent validation study were 95 blue-collar employees working at a chemical firm in China, who completed a self-report personality measure, and their supervisors rated their safety performance. Results indicated that (a) conscientiousness scores, but not agreeableness and emotional stability scores, were positively related to safety performance, and (b) SDE and IM scores moderated, rather than suppressed, the criterion-related validity of conscientiousness scores in predicting safety performance ratings such that the validity was stronger when SDE or IM scores were lower. These findings suggested that both SDE and IM may introduce criterion-irrelevant error into personality scores that weaken personality validity. Implications, study limitations, and future research directions were discussed.</p></div>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 4","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145037899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Item Response Theory Analysis of a Work Ethic Scale: Evidence From Multi-Industry Samples in West Africa","authors":"Simon Ntumi","doi":"10.1111/ijsa.70023","DOIUrl":"https://doi.org/10.1111/ijsa.70023","url":null,"abstract":"<div>\u0000 \u0000 <p>In light of growing interest in cross-cultural workforce behaviors and the need for psychometrically sound tools to assess employee values, this study investigated the validity and fairness of a Work Ethic Scale across key industries in West Africa. This study examined the psychometric properties of an adapted Work Ethic Scale across major industries in West Africa using item response theory (IRT). A sample of 800 full-time employees from education, finance, manufacturing, and healthcare sectors in Ghana and Nigeria completed the scale. A two-parameter logistic model was applied to assess item discrimination, difficulty, differential item functioning (DIF), and measurement precision. Dimensionality assessment via exploratory factor analysis and Horn's parallel analysis confirmed essential unidimensionality. The first factor had an eigenvalue of 6.87 (34.35% variance explained), exceeding the parallel analysis criterion of 1.42. Subsequent factors fell below their respective thresholds, and model fit was confirmed [KMO = 0.91; Bartlett's χ²(190) = 2892.31, <i>p</i> < 0.001; TLI = 0.91; RMSR = 0.04]. IRT analysis showed strong item-level performance. Discrimination values (a-parameters) ranged from 0.98 to 2.25, with 17 of 20 items exceeding 1.20. The highest discrimination was observed for “I believe in earning rewards through effort” (<i>a</i> = 2.25, SE = 0.15), and the lowest for “I see leisure as less important than work” (<i>a</i> = 0.98, SE = 0.09). Difficulty parameters (<i>b</i>-values) ranged from −0.70 to 0.80, indicating that most items effectively targeted mid-levels of the work ethic trait. Model fit for all items was adequate (S-X² <i>p</i> values > 0.05). DIF analysis revealed five items with significant DIF across industries. Uniform DIF was found in “Work is central to my life” [χ²(3) = 19.88, <i>p</i> = 0.0002, <i>η</i>² = 0.08], while nonuniform DIF was observed for “I feel guilty when I'm not being productive” [χ²(3) = 21.10, <i>p</i> = 0.0001, <i>η</i>² = 0.07]. Test information function analysis showed peak information at θ = 0.0 (<i>I</i> = 1.93, SE = 0.79, <i>R</i> = 0.92), with decreasing precision at θ = ±2.0. A key recommendation is to revise or add items to enhance measurement precision at the lower and higher extremes of work ethic, ensuring a more balanced assessment across all trait levels.</p></div>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 4","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144915273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Harnessing Generative AI for Assessment Item Development: Comparing AI-Generated and Human-Authored Items","authors":"Jaclyn Martin Kowal, Kenzie Hurley Bryant, Dan Segall, Tracy Kantrowitz","doi":"10.1111/ijsa.70021","DOIUrl":"https://doi.org/10.1111/ijsa.70021","url":null,"abstract":"<p>The use of generative AI, specifically large language models (LLMs), in test development presents an innovative approach to efficiently creating technical, knowledge-based assessment items. This study evaluates the efficacy of AI-generated items compared to human-authored counterparts within the context of employee selection testing, focusing on data science knowledge areas. Through a paired comparison approach, subject matter experts (SMEs) were asked to evaluate items produced by both LLMs and human item writers. Findings revealed a significant preference for LLM-generated items, particularly in specific knowledge domains such as Statistical Foundations and Scientific Data Analysis. However, despite the promise of generative AI in accelerating item development, human review remains critical. Issues such as multiple correct answers or ineffective distractors in AI-generated items necessitate thorough SME review and revision to ensure quality and validity. The study highlights the potential of integrating AI with human expertise to enhance the efficiency of item generation while maintaining psychometric standards in high-stakes environments. The implications for psychometric practice and the necessity of domain-specific validation are discussed, offering a framework for future research and application of AI in test development.</p>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 3","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70021","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144910476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linking Strengths-Based Supervisor Feedback to Job Crafting: The Moderating Role of Self-Efficacy","authors":"Isabeau Van Strydonck, Vincent Goossens, Arnold B. Bakker, Evangelia Demerouti, Adelien Decramer, Mieke Audenaert","doi":"10.1111/ijsa.70020","DOIUrl":"https://doi.org/10.1111/ijsa.70020","url":null,"abstract":"<p>Strengths-based supervisor feedback focuses on identifying, appreciating, and utilizing employees' unique qualities at work. Building on Job Demands-Resources (JD-R) theory, we hypothesize that strengths-based supervisor feedback is positively related to employee job crafting through employee work engagement. Moreover, we challenge the idea that strengths-based supervisor feedback is equally beneficial to all employees, introducing employees' personal resources (self-efficacy) as a boundary condition. The results of a time-separated study using reports from 244 employees showed that T1 strengths-based supervisor feedback was positively related to T3 employee job crafting through T2 work engagement. Moreover, results demonstrated that receiving strengths-based supervisor feedback was especially important for employees low (vs. high) in self-efficacy. For employees who already strongly believed in their abilities to successfully perform their work tasks (i.e., who had high levels of self-efficacy), strengths-based supervisor feedback did not contribute to their work engagement and job crafting. The indirect effect between strengths-based supervisor feedback and job crafting via work engagement was replicated in a second time-separated survey study among 280 employees.</p>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 3","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70020","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144869798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Signals That Matter: Resumes, Cover Letters, and Success on the Job Search","authors":"Timothy G. Wingate, Chet Robie, Deborah M. Powell, Joshua S. Bourdage","doi":"10.1111/ijsa.70022","DOIUrl":"https://doi.org/10.1111/ijsa.70022","url":null,"abstract":"<p>Although organizations commonly collect resumes and cover letters to screen job applicants, the features of these materials that affect hiring remain unclear. The current study tracked 183 students applying for full-time, paid jobs in their field, in connection with a postsecondary co-operative education program in Canada. We found that applicants whose cover letters and resumes were written with more detail, clarity, and structure secured substantially more interviews (relative to the number of applications submitted) and took less time to secure a position. These effects remained strong after controlling for content factors (work experience and achievement) and tailoring of materials, suggesting that organizations may be hiring applicants based on compositional factors that can be easily produced by new technologies like ChatGPT.</p>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 3","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70022","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144843410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personality Test Validity Differs Between Low-Stakes and High-Stakes Employment Settings","authors":"Robert W. Loy, Neil D. Christiansen, Robert P. Tett, Katherine Klein, Margaret Toich","doi":"10.1111/ijsa.70018","DOIUrl":"https://doi.org/10.1111/ijsa.70018","url":null,"abstract":"<div>\u0000 \u0000 <p>The impact of applicant faking on personality test validity in high-stakes settings remains debated in personnel selection research, with some arguing it distorts scores while others suggest minimal effects on validity. This meta-analysis compares personality test validity across low-stakes (e.g., employee assessments) and high-stakes (e.g., applicant testing) settings. Results show validity was consistently higher in low-stakes settings across both unmatched and matched samples. In unmatched studies, personality test validity was higher in low-stakes settings (<i>r'</i> = 0.17, <i>k</i> = 20, <i>N</i> = 8883) than in high-stakes settings (<i>r'</i> = 0.13, <i>k</i> = 215, N = 68,372). Matched studies showed a substantial difference, where low-stakes validity (<i>r'</i> = 0.27) was 125% larger than high-stakes validity (<i>r'</i> = 0.12). These findings provide strong empirical evidence that faking substantially reduces personality test validity in selection contexts. We recommend organizations treat low-stakes validity evidence as provisional and use it only for interim hiring decisions until high-stakes validation data is available. To improve selection accuracy, organizations should prioritize validation studies in motivated samples, apply statistical corrections for faking, and implement faking-resistant measures (e.g., forced-choice formats).</p></div>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 3","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144832443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Role of Expertise in Scoring Situational Judgment Tests: Is the Juice Worth the Squeeze?","authors":"Michelle Martin-Raugh, Jonathan F. Kochert, Steven Holtzman, Harrison J. Kell","doi":"10.1111/ijsa.70019","DOIUrl":"https://doi.org/10.1111/ijsa.70019","url":null,"abstract":"<p>Drawing from theory regarding the general and domain-specific knowledge typically measured by situational judgment tests (SJTs), this study examined the effects of subject matter experts (SME) qualifications on SJT scoring keys and their associated predictive validity. Although one may expect that scoring keys generated using the judgments of more qualified SMEs would result in higher predictive validity, an exploration of any gains in incremental validity associated with the keys is warranted to determine whether gains are meaningful, and perhaps more importantly, worth the additional resources required to obtain highly qualified SMEs. We created three distinct SJT scoring keys for an SJT designed to measure cross-cultural competence based on a sample of crowdsourced novices, a sample of incumbents, and a sample of highly trained and experienced SMEs and examined their predictive validity. Findings from a time-lagged, year-long concurrent validity study (<i>N</i> = 350) provide some support for the idea that using particularly qualified, highly experienced SMEs to develop SJT scoring keys may provide a meaningful increase in the predictive validity of the assessment over using a crowdsourced novice sample or a convenience sample of incumbents that are often used in practice.</p>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 3","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70019","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144716678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Consideration of Immediate and Long-Term Consequences of Generative AI Usage in Selection","authors":"Agata Mirowska","doi":"10.1111/ijsa.70017","DOIUrl":"https://doi.org/10.1111/ijsa.70017","url":null,"abstract":"<p>The focal article by Lievens and Dunlop (2025) recognizes the potential benefits of applicant use of Generative AI in the selection process. Nevertheless, this commentary argues that organizations need to consider how to incorporate such usage into the selection system, to ensure that the proper criteria are being assessed. Furthermore, I argue for a longer-term perspective, proposing that Generative AI use in the selection process without proper reflection may lead to a “sabotaging” effect on a) prediction of candidate future job performance and b) potential future work relationships and ensuing organizational climate.</p>","PeriodicalId":51465,"journal":{"name":"International Journal of Selection and Assessment","volume":"33 3","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70017","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144611999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}