Can Interviewees Fake Out AI? Comparing the Susceptibility and Mechanisms of Faking Across Self-Reports, Human Interview Ratings, and AI Interview Ratings

IF 2.6 · JCR Q3 (MANAGEMENT) · CAS Tier 4, Management
Louis Hickman, Josh Liff, Colin Willis, Emily Kim
DOI: 10.1111/ijsa.70014
Journal: International Journal of Selection and Assessment, 33(2)
Publication date: 2025-05-06 (Journal Article)
Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70014
Cited by: 0

Abstract

Artificial intelligence (AI) is increasingly used to score employment interviews in the early stages of the hiring process, but AI algorithms may be particularly prone to interviewee faking. Our study compared the extent to which people can improve their scores on self-report scales, structured and less structured human interview ratings, and AI interview ratings. Further, we replicate and extend prior research by examining how interviewee abilities and impression management tactics influence score inflation across scoring methods. Participants (N = 152) completed simulated, asynchronous interviews in honest and applicant-like conditions in a within-subjects design. The AI algorithms in the study were trained to replicate question-level structured interview ratings. Participants' scores increased most on self-reports (overall Cohen's d = 0.62) and least on AI interview ratings (overall Cohen's d = 0.14), although AI score increases were similar to those observed for human interview ratings (overall Cohen's d = 0.22). On average, across conditions, AI interview ratings converged more strongly with structured human ratings based on behaviorally anchored rating scales than with less structured human ratings. Verbal ability only predicted score improvement on self-reports, while increased use of honest defensive impression management tactics predicted improvement in AI and less structured human interview scores. Ability to identify criteria did not predict score improvement. Overall, these AI interview scores behaved similarly to structured human ratings. We discuss future possibilities for investigating faking in AI interviews, given that interviewees may try to “game” the system when aware that they are being evaluated by AI.
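The effect sizes reported above (e.g., overall Cohen's d = 0.62 for self-reports vs. 0.14 for AI ratings) quantify how much scores shifted between the honest and applicant-like conditions. As an illustrative sketch only (the paper does not specify its exact formula, and the scores below are hypothetical), a standardized mean difference for a within-subjects design can be computed as the difference in condition means divided by the pooled standard deviation of the two conditions:

```python
import math

def cohens_d_paired(honest, faked):
    """Standardized mean difference for paired (within-subjects) scores.

    One common convention: (mean of faked - mean of honest) divided by
    the pooled SD of the two conditions. Other conventions exist (e.g.,
    dividing by the SD of the difference scores), and the study's exact
    computation is not specified here.
    """
    assert len(honest) == len(faked), "paired design requires equal n"
    n = len(honest)
    mean_h = sum(honest) / n
    mean_f = sum(faked) / n
    # Sample variances (n - 1 denominator) for each condition
    var_h = sum((x - mean_h) ** 2 for x in honest) / (n - 1)
    var_f = sum((x - mean_f) ** 2 for x in faked) / (n - 1)
    pooled_sd = math.sqrt((var_h + var_f) / 2)
    return (mean_f - mean_h) / pooled_sd

# Hypothetical interview scores for five participants in each condition
honest_scores = [3.0, 3.5, 2.5, 4.0, 3.0]
faked_scores = [3.5, 4.0, 3.0, 4.5, 3.5]
print(round(cohens_d_paired(honest_scores, faked_scores), 2))
```

A larger d for self-reports than for AI or human interview ratings is what the study's finding of greater self-report score inflation would look like under this metric.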

Source Journal
CiteScore: 4.10
Self-citation rate: 31.80%
Articles published per year: 46
Journal description: The International Journal of Selection and Assessment publishes original articles related to all aspects of personnel selection, staffing, and assessment in organizations. Using an effective combination of academic research with professional-led best practice, IJSA aims to develop new knowledge and understanding in these important areas of work psychology and contemporary workforce management.