Can ChatGPT Outperform Humans in Faking a Personality Assessment While Avoiding Detection?

Impact Factor 2.6 · JCR Q3 (Management) · CAS Tier 4 (Management Science)
Chet Robie, Jane Phillips, Joshua S. Bourdage, Neil D. Christiansen, Patrick D. Dunlop, Stephen D. Risavy, Andrew B. Speer
International Journal of Selection and Assessment, Vol. 33, Issue 3 (published 2025-06-01). DOI: 10.1111/ijsa.70015. Full text: https://onlinelibrary.wiley.com/doi/10.1111/ijsa.70015 (PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/ijsa.70015)
Citations: 0

Abstract

Large language models (LLMs), such as ChatGPT, have reshaped opportunities and challenges across various fields, including human resources (HR). Concerns have arisen about the potential for personality assessment manipulation using LLMs, posing a risk to the validity of these tools. This threat is a reality: recent research suggests that many candidates are using AI to complete pre-hire assessments. This study addresses this problem by examining whether ChatGPT can outperform humans in faking personality assessments while avoiding detection. To explore this, two experiments were conducted focusing on assessing job-relevant traits, with and without coaching, and with two methods of identifying faking, specifically using an impression management (IM) measure and an overclaiming questionnaire (OCQ). For each study, we used responses from 100 working adults recruited via the Prolific platform, which were compared to 100 replications from ChatGPT. The results revealed that while ChatGPT showed some ability to manipulate assessments, without coaching it did not consistently outperform humans. Coaching had a minimal impact on reducing IM scores for either humans or ChatGPT, but reduced OCQ bias scores for ChatGPT. These findings highlight the limitations of current faking detection measures and emphasize the need for further research to refine methods for ensuring the integrity of personality assessments in HR, particularly as artificial intelligence becomes more available to candidates.
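For context on the OCQ results above: overclaiming questionnaires are conventionally scored with signal-detection indices, treating claimed familiarity with real items as hits and claimed familiarity with nonexistent foil items as false alarms. The paper's exact scoring procedure is not reproduced here; the sketch below is a minimal illustration assuming the common d′ (accuracy) and c (bias) formulation, where the `ocq_indices` function and its example counts are hypothetical.

```python
from statistics import NormalDist

def ocq_indices(hits, n_real, false_alarms, n_foils):
    """Signal-detection scoring for an overclaiming questionnaire (OCQ).

    hits: real items the respondent claims to know
    false_alarms: foil (nonexistent) items the respondent claims to know
    Returns (accuracy, bias): d' reflects genuine knowledge; lower
    (more negative) c reflects a more liberal claiming style, i.e.
    greater overclaiming bias.
    """
    z = NormalDist().inv_cdf
    # Shift rates away from 0 and 1 so z() stays finite (log-linear correction).
    hit_rate = (hits + 0.5) / (n_real + 1)
    fa_rate = (false_alarms + 0.5) / (n_foils + 1)
    accuracy = z(hit_rate) - z(fa_rate)        # d'
    bias = -0.5 * (z(hit_rate) + z(fa_rate))   # c
    return accuracy, bias

# A respondent who claims most foils gets a lower (more liberal) bias c
# than one who claims none of them:
acc_liberal, bias_liberal = ocq_indices(hits=12, n_real=15, false_alarms=5, n_foils=5)
acc_honest, bias_honest = ocq_indices(hits=12, n_real=15, false_alarms=0, n_foils=5)
```

Under this scoring, a faker who indiscriminately endorses items inflates false alarms on foils, which is what lets the OCQ flag overclaiming even when self-report scale scores look plausible.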

Source journal metrics: CiteScore 4.10 · Self-citation rate 31.80% · Articles published per year: 46
期刊介绍: The International Journal of Selection and Assessment publishes original articles related to all aspects of personnel selection, staffing, and assessment in organizations. Using an effective combination of academic research with professional-led best practice, IJSA aims to develop new knowledge and understanding in these important areas of work psychology and contemporary workforce management.