Do truthfulness notifications influence perceptions of AI-generated political images? A cognitive investigation with EEG
Colin Conrad, Anika Nissen, Kya Masoumi, Mayank Ramchandani, Rafael Fecury Braga, Aaron J. Newman
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100185. DOI: 10.1016/j.chbah.2025.100185. Published 2025-07-22.

Abstract: Political misinformation is a growing problem for democracies, partly due to the rise of widely accessible artificial intelligence-generated content (AIGC). In response, social media platforms are increasingly considering explicit AI content labeling, though the evidence to support the effectiveness of this approach has been mixed. In this paper, we discuss two studies which shed light on antecedent cognitive processes that help explain why and how AIGC labeling impacts user evaluations in the specific context of AI-generated political images. In the first study, we conducted a neurophysiological experiment with 26 participants using EEG event-related potentials (ERPs) and self-report measures to gain deeper insights into the brain processes associated with the evaluations of artificially generated political images and AIGC labels. In the second study, we embedded some of the stimuli from the EEG study into replica YouTube recommendations and administered them to 276 participants online. The results from the two studies suggest that AI-generated political images are associated with heightened attentional and emotional processing. These responses are linked to perceptions of humanness and trustworthiness. Importantly, trustworthiness perceptions can be impacted by effective AIGC labels. We found effects traceable to the brain's late-stage executive network activity, as reflected by patterns of the P300 and late positive potential (LPP) components. Our findings suggest that AIGC labeling can be an effective approach for addressing online misinformation when the design is carefully considered. Future research could extend these results by pairing more photorealistic stimuli with ecologically valid social-media tasks and multimodal observation techniques to refine label design and personalize interventions across demographic segments.

The emotional cost of AI chatbots in education: Who benefits and who struggles?
Justin W. Carter, Justin T. Scott, John D. Barrett
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100181. DOI: 10.1016/j.chbah.2025.100181. Published 2025-07-11.

Abstract: Recent advancements in large language models have enabled the development of advanced chatbots, offering new opportunities for personalized learning and academic support that could transform the way students learn. Despite their growing popularity and promising benefits, their psychological impact remains poorly understood. Accordingly, this study examined the effects of chatbot use on students' positive and negative affect and considered the moderating role of familiarity. Using a pre-post control group design, undergraduate students were divided into two groups to complete an assignment. The groups received the same task and differed only in whether they were instructed to use or not to use an AI chatbot. Students who used a chatbot reported significantly lower positive affect, with no significant difference in negative affect. Importantly, familiarity with chatbots moderated changes in positive affect, such that students with more familiarity reported smaller declines. These findings showcase chatbots' double-edged effects: while the tools may prove empowering when used effectively, they can also diminish the positive aspects of completing assignments for those with less familiarity. The findings underscore the behavioral complexity of AI integration by highlighting how familiarity moderates affective outcomes and how chatbot use may reduce positive emotional engagement without increasing negative affect. Integrating AI tools in education requires not just access and training, but a nuanced understanding of how student behavior and emotional well-being are shaped by interaction with intelligent systems.

RVBench: Role values benchmark for role-playing LLMs
Ye Wang, Tong Li, Meixuan Li, Ziyue Cheng, Ge Wang, Hanyue Kang, Yaling Deng, Hongjiang Xiao, Yuan Zhang
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100184. DOI: 10.1016/j.chbah.2025.100184. Published 2025-07-09.

Abstract: With the explosive development of Large Language Models (LLMs), the demand for role-playing agents has greatly increased, driving applications such as personalized digital companions and artificial society simulation. In LLM-driven role-playing, the values of agents lay the foundation for their attitudes and behaviors, so value alignment is crucial for enhancing the realism of interactions and enriching the user experience. However, a benchmark for evaluating values in role-playing LLMs has been absent. In this study, we built a Role Values Dataset (RVD) containing 25 roles as the ground truth. Additionally, inspired by psychological tests in humans, we proposed a Role Values Benchmark (RVBench), comprising values rating and values ranking methods, to evaluate the values of role-playing LLMs from subjective questionnaires and observed behavior. The values rating method tests value orientation through the revised Portrait Values Questionnaire (PVQ-RR), which provides a direct and quantitative comparison of the roles to be played. The values ranking method assesses whether the behaviors of agents are consistent with their values' hierarchical organization when encountering dilemmatic scenarios. Testing a selection of open-source and closed-source LLMs revealed that GLM-4 exhibited values most closely mirroring the roles in the RVD. However, compared to the preset roles, there is still a gap in the role-playing ability of LLMs, including consistency, stability, and flexibility across value dimensions. These findings prompt a vital need for further research aimed at refining the role-playing capacities of LLMs from a value-alignment perspective. The RVD is available at: https://github.com/northwang/RVD.

Self-disclosure to AI: People provide personal information to AI and humans equivalently
Elizabeth R. Merwin, Allen C. Hagen, Joseph R. Keebler, Chad Forbes
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100180. DOI: 10.1016/j.chbah.2025.100180. Published 2025-07-09.

Abstract: As Artificial Intelligence (AI) increasingly emerges as a tool in therapeutic settings, understanding individuals' willingness to disclose personal information to AI versus humans is critical. This study examined how participants chose between self-disclosure-based and fact-based statements when their responses were thought to be analyzed by an AI, analyzed by a human researcher, or kept private. Participants completed forced-choice trials in which they selected a self-disclosure-based or fact-based statement for one of the three agent conditions. Results showed that participants were significantly more likely to select self-disclosure statements than fact-based statements. Self-disclosure rates were similar for the AI and the human researcher, but significantly lower when responses were kept private. Multiple regression analyses revealed that individuals with higher scores on the negative attitudes toward AI scale were less likely to choose self-disclosure statements across the three agent conditions. Overall, individuals were just as likely to choose to self-disclose to an AI as to a human researcher, and more likely to choose either agent over keeping self-disclosure information private. In addition, personality traits and attitudes toward AI significantly influenced disclosure choices. These findings provide insight into how individual differences shape the willingness to self-disclose information in human-AI interactions and offer a foundation for exploring the feasibility of AI as a clinical and social tool. Future research should expand on these results to further understand self-disclosure behaviors and evaluate AI's role in therapeutic settings.

A critical discussion of strategies and ramifications of implementing conversational agents in mental healthcare
Arthur Bran Herbener, Michał Klincewicz, Lily Frank, Malene Flensborg Damholdt
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100182. DOI: 10.1016/j.chbah.2025.100182. Published 2025-07-08.

Abstract: In recent years, there has been growing optimism about the potential of conversational agents, such as chatbots and social robots, in mental healthcare. Their scalability offers a promising solution to some of the key limitations of the dominant model of treatment in Western countries. However, while recent experimental research provides grounds for cautious optimism, the integration of conversational agents into mental healthcare raises significant clinical and ethical challenges, particularly concerning the partial or full replacement of human practitioners. Overall, this theoretical paper examines the clinical and ethical implications of deploying conversational agents in mental health services as partial and full replacements of human practitioners. On the one hand, we outline how these agents can circumvent core treatment barriers through stepped care, blended care, and a personalized medicine approach. On the other hand, we argue that the partial or full substitution of human practitioners can have profound consequences for the ethical landscape of mental healthcare, potentially undermining patients' rights and safety. By making this argument, this work extends prior literature by specifically considering how different levels of implementation of conversational agents in healthcare present both opportunities and risks. We argue for the urgent need to establish regulatory frameworks to ensure that the integration of conversational agents into mental healthcare is both safe and ethically sound.

Navigating relationships with GenAI chatbots: User attitudes, acceptability, and potential
Laura M. Vowels, Rachel R.R. Francois-Walcott, Maëlle Grandjean, Joëlle Darwiche, Matthew J. Vowels
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100183. DOI: 10.1016/j.chbah.2025.100183. Published 2025-07-08.

Abstract: Despite the growing adoption of GenAI chatbots in health and well-being contexts, little is known about public attitudes toward their use for relationship support or the factors shaping acceptance and effectiveness. This research addresses that gap across three studies. Study 1 involved five focus groups with 30 young people to gauge general attitudes toward GenAI chatbots in relationship contexts. Study 2 evaluated user experiences during a single relationship intervention session with 20 participants. Study 3 quantitatively measured changes in attitudes toward GenAI chatbots and online interventions among 260 participants, assessed before, immediately after, and two weeks following their interaction with a GenAI chatbot or a writing task. Three main themes emerged in Studies 1 and 2: "Accessible First-Line Treatment", "Artificial Advice for Human Connection", and "Internet Archive". Additionally, Study 1 revealed the themes "Privacy vs. Openness" and "Are We in a Black Mirror Episode?", while Study 2 uncovered the themes "Exceeding Expectations" and "Supporting Neurodivergence". Study 3 indicated that GenAI chatbot interactions reduced effort expectancy and produced short-term increases in acceptance of, and decreases in objections to, GenAI chatbots, though these effects were not sustained at the two-week follow-up. Both intervention types improved general attitudes toward online interventions, suggesting that exposure can enhance the uptake of digital health tools. This research underscores the evolving role of GenAI chatbots in augmenting therapeutic practices, highlighting their potential for personalized, accessible, and effective relationship interventions in the digital age.

Knowledge cues to human origins facilitate self-disclosure during interactions with chatbots
Gabriella Warren-Smith, Guy Laban, Emily-Marie Pacheco, Emily S. Cross
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100174. DOI: 10.1016/j.chbah.2025.100174. Published 2025-06-20.

Abstract: Chatbots are emerging as a self-management tool for supporting mental health, appearing across commercial and healthcare settings. Whilst chatbots are valued for their perceived lack of judgement, they lack the emotional intelligence and empathy to build trust and rapport with users. A resulting debate questions whether chatbots facilitate or hinder self-disclosure. This study presents a within-subjects experimental design investigating the parameters of self-disclosure in open-domain social interactions with chatbots. Participants engaged in two short social interactions with two chatbots: one with the knowledge that they were conversing with a chatbot and one with the false belief that they were conversing with a human. A significant difference was found across the two treatments: participants disclosed more to the chatbot introduced as a human, perceived themselves to do so, found this chatbot more comforting, and attributed higher agency and experience to it than to the chatbot introduced as a chatbot. However, participants' disclosures to the chatbot introduced as a chatbot were more sentimental, and they found it friendlier than the chatbot introduced as a human. These results indicate that whilst cues to a chatbot's human origins enhance self-disclosure and perceptions of mind, when the artificial agent is perceived against one's social expectations, it may be viewed negatively on social factors that require higher cognitive processing.

Evaluating the Intelligence of large language models: A comparative study using verbal and visual IQ tests
Sherif Abdelkarim, David Lu, Dora-Luz Flores, Susanne Jaeggi, Pierre Baldi
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100170. DOI: 10.1016/j.chbah.2025.100170. Published 2025-06-18.

Abstract: Large language models (LLMs) excel on many specialized benchmarks, yet their general-reasoning ability remains opaque. We therefore test 18 models – including GPT-4, Claude 3 and Gemini Pro – on a 14-section IQ suite spanning verbal, numerical and visual puzzles and add a "multi-agent reflection" variant in which one model answers while others critique and revise. Results replicate known patterns: a strong bias towards verbal vs numerical reasoning (GPT-4: 79% vs 53% accuracy), a pronounced modality gap (text IQ ≈ 125 vs visual IQ ≈ 103), and persistent failure on abstract arithmetic (≤ 20% on missing-number tasks). Scaling lifts mean IQ from 89 (tiny models) to 131 (large models), but gains are non-uniform, and reflection yields only modest extra points for frontier systems. Our contributions include: (1) proposing an evaluation framework for LLM "intelligence" using both verbal and visual IQ tasks; (2) analyzing how multi-agent setups with varying actor and critic sizes affect problem-solving performance; (3) analyzing how model size and multi-modality affect performance across diverse reasoning tasks; and (4) highlighting the value of IQ tests as a standardized, human-referenced benchmark that enables longitudinal comparison of LLMs' cognitive abilities relative to human norms. We further discuss the limitations of IQ tests as an AI benchmark and outline directions for more comprehensive evaluation of LLM reasoning capabilities.

Development and validation of a short AI literacy test (AILIT-S) for university students
Marie Hornberger, Arne Bewersdorff, Daniel S. Schiff, Claudia Nerdel
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100176. DOI: 10.1016/j.chbah.2025.100176. Published 2025-06-16.

Abstract: Fostering AI literacy is an important goal in higher education in many disciplines. Assessing AI literacy can inform researchers and educators about current AI literacy levels and provide insights into the effectiveness of learning and teaching in the field of AI. It can also inform decision-makers and policymakers about the successes and gaps with respect to AI literacy within certain institutions, populations, or countries. However, most of the available AI literacy tests are quite long and time-consuming. A short test of AI literacy would instead enable efficient measurement and facilitate better research and understanding. In this study, we develop and validate a short version of an existing validated AI literacy test. Based on a sample of 1,465 university students across three Western countries (Germany, UK, US), we select a subset of items according to content validity, coverage of different difficulty levels, and ability to discriminate between participants. The resulting short version, AILIT-S, consists of 10 items and can be used to assess AI literacy in under 5 minutes. While the shortened test is less reliable than the long version, it maintains high construct validity and has high congruent validity. We offer recommendations for researchers and practitioners on when to use the long or short version.

Using educational robotics to support motor, cognitive, and social skills in a child with spinal muscular atrophy. A single-case study
Antonella D'Amico, Giuseppina Paci, Laura di Domenico, Alessandro Geraci
Computers in Human Behavior: Artificial Humans, Vol. 5, Article 100175. DOI: 10.1016/j.chbah.2025.100175. Published 2025-06-13.

Abstract: This study reports the results of a single-case intervention involving a child with spinal muscular atrophy. The aim of the study was to promote fine motor skills, visual-motor integration, attentional behaviors, and learning. The treatment was based on the RE4BES protocol, which consists of a set of guidelines for conducting tailored educational robotics activities designed for children with special needs. We employed an experimental single-case ABA design, including Baseline 1 (A1), Treatment (B), and Baseline 2 (A2), with eight sessions per phase. The treatment phase involved activities with Blue-Bot and LEGO® WeDo 2.0. Results showed significant improvements in gross and fine motor skills from baseline to the treatment phase, with these gains maintained after the intervention. Moreover, in alignment with the main goals of school inclusion for people with special needs, results demonstrated that the intervention also improved awareness, flexibility, cooperation, and initiative within the classroom. Despite the study's limitations, the findings support the effectiveness of the RE4BES protocol and suggest that educational robotics can be a valuable tool in special education settings.
