Jinkyung Katie Park, Pinxuan Alina Yu, Vignesh Krishnan, Huaye Li, Linda A Reddy, Vivek K Singh
{"title":"Designing Psychologically Grounded Artificial Intelligence for Supporting Bystander-Based Cyberaggression Intervention: Mixed Methods Exploratory Study.","authors":"Jinkyung Katie Park, Pinxuan Alina Yu, Vignesh Krishnan, Huaye Li, Linda A Reddy, Vivek K Singh","doi":"10.2196/84391","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Cyberaggression poses a growing threat to mental health, contributing to increased distress, reduced self-esteem, and other adverse psychosocial outcomes. Although bystander intervention can mitigate the escalation and impact of cyberaggression, individuals often lack the confidence, strategies, or language to respond effectively in these high-stakes online interactions. Advances in generative artificial intelligence (AI) present a novel opportunity to facilitate digital behavior change by assisting bystanders with contextually appropriate, theory-informed intervention messages that promote safer online environments and support mental well-being.</p><p><strong>Objective: </strong>This mixed methods design study aimed to explore the feasibility of using generative AI to support bystander intervention in cyberaggression on social media. Specifically, we examined whether AI can generate effective responses aligned with established intervention strategies and how these responses are perceived in terms of their potential to de-escalate online harm and foster behavior change.</p><p><strong>Methods: </strong>We collected 1000 real-world cyberaggression examples from public social media datasets and generated bystander intervention responses using 3 distinct prompt strategies: a generic policy reminder, a baseline GPT prompt, and a theory-driven GPT prompt (AllyGPT). To evaluate the responses, we conducted computational linguistic analyses to assess their psycholinguistic features and carried out a mixed methods evaluation. Three trained coders rated each message on favorability, conversational impact, and potential to change behavior and later participated in semistructured interviews to reflect on their evaluation process and perceptions of intervention effectiveness.</p><p><strong>Results: </strong>Linguistic analyses revealed that baseline GPT responses exhibited more emotionally positive and authentic language compared to AllyGPT responses, which showed a more analytical and assertive tone. Policy reminder messages were linguistically rigid and lacked emotional nuance. Human evaluation results showed that AllyGPT responses received the highest effectiveness ratings for low-incivil cyberaggression cases in 2 dimensions (favorability and changing behavior), and baseline GPT works better for mid and high levels for all effectiveness dimensions. For medium- and high-incivility aggressions, baseline GPT responses received the highest ratings across all 3 dimensions of effectiveness (favorability, discussion-shifting potential, and likelihood of changing bullying behavior), followed by AllyGPT, with policy reminders rated lowest. Qualitative feedback further emphasized that baseline GPT responses were perceived as natural and inclusive, while AllyGPT responses, although grounded in psychological theory, were sometimes viewed as overly direct. Policy reminders were considered clear but lacked persuasive impact.</p><p><strong>Conclusions: </strong>Our work showed that designing effective AI-generated bystander interventions requires a deep sensitivity to platform culture, social context, and user expectations. 
By combining psychological theory with adaptive, conversational design and ongoing feedback loops, future systems can better support bystanders, delivering interventions that are not only contextually appropriate but also socially resonant and behaviorally impactful. As such, this work serves as a foundation for scalable, human-centered AI systems that promote safer online spaces and users' mental well-being.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e84391"},"PeriodicalIF":2.0000,"publicationDate":"2026-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13075537/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Formative Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2196/84391","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}
引用次数: 0
Abstract
Background: Cyberaggression poses a growing threat to mental health, contributing to increased distress, reduced self-esteem, and other adverse psychosocial outcomes. Although bystander intervention can mitigate the escalation and impact of cyberaggression, individuals often lack the confidence, strategies, or language to respond effectively in these high-stakes online interactions. Advances in generative artificial intelligence (AI) present a novel opportunity to facilitate digital behavior change by assisting bystanders with contextually appropriate, theory-informed intervention messages that promote safer online environments and support mental well-being.
Objective: This mixed methods design study aimed to explore the feasibility of using generative AI to support bystander intervention in cyberaggression on social media. Specifically, we examined whether AI can generate effective responses aligned with established intervention strategies and how these responses are perceived in terms of their potential to de-escalate online harm and foster behavior change.
Methods: We collected 1000 real-world cyberaggression examples from public social media datasets and generated bystander intervention responses using 3 distinct prompt strategies: a generic policy reminder, a baseline GPT prompt, and a theory-driven GPT prompt (AllyGPT). To evaluate the responses, we conducted computational linguistic analyses to assess their psycholinguistic features and carried out a mixed methods evaluation. Three trained coders rated each message on favorability, conversational impact, and potential to change behavior and later participated in semistructured interviews to reflect on their evaluation process and perceptions of intervention effectiveness.
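To make the prompting setup concrete, below is a minimal Python sketch of how the three response conditions could be produced with the OpenAI API. The prompt wording, variable names, and model choice are illustrative assumptions for this sketch, not the authors' actual prompts or configuration.

```python
# Hypothetical sketch of the three prompt strategies described in the Methods.
# POLICY_REMINDER, BASELINE_PROMPT, ALLYGPT_PROMPT, and the model choice are
# illustrative assumptions, not the study's exact prompts or settings.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Strategy 1: a fixed, generic policy reminder (no generation needed).
POLICY_REMINDER = (
    "Reminder: this community does not allow harassment or personal attacks. "
    "Please review the platform's community guidelines."
)

# Strategy 2: a baseline prompt that simply asks for a bystander reply.
BASELINE_PROMPT = (
    "You are a bystander on social media. Write a brief, civil reply to the "
    "following aggressive comment that discourages further aggression:\n\n{comment}"
)

# Strategy 3: a theory-driven, AllyGPT-style prompt that encodes
# bystander-intervention principles (acknowledge the target, name the harm,
# redirect the conversation constructively).
ALLYGPT_PROMPT = (
    "You are a supportive bystander. Using bystander-intervention principles, "
    "(1) acknowledge the target's feelings, (2) label the comment below as "
    "harmful without attacking its author, and (3) redirect the conversation "
    "constructively. Aggressive comment:\n\n{comment}"
)


def generate_reply(prompt_template: str, comment: str) -> str:
    """Generate one bystander response for a given aggressive comment."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt_template.format(comment=comment)}],
        temperature=0.7,
    )
    return response.choices[0].message.content


comment = "Nobody cares what you think, just log off already."
replies = {
    "policy_reminder": POLICY_REMINDER,
    "baseline_gpt": generate_reply(BASELINE_PROMPT, comment),
    "allygpt": generate_reply(ALLYGPT_PROMPT, comment),
}
```

In the study, each of the 1000 collected cyberaggression examples would be run through all three conditions so that the resulting response sets can be compared linguistically and by human raters.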
Results: Linguistic analyses revealed that baseline GPT responses used more emotionally positive and authentic language than AllyGPT responses, which had a more analytical and assertive tone. Policy reminder messages were linguistically rigid and lacked emotional nuance. Human evaluation showed that for low-incivility cyberaggression cases, AllyGPT responses received the highest effectiveness ratings on 2 dimensions (favorability and likelihood of changing behavior). For medium- and high-incivility cases, baseline GPT responses received the highest ratings across all 3 effectiveness dimensions (favorability, discussion-shifting potential, and likelihood of changing bullying behavior), followed by AllyGPT, with policy reminders rated lowest. Qualitative feedback further emphasized that baseline GPT responses were perceived as natural and inclusive, whereas AllyGPT responses, although grounded in psychological theory, were sometimes viewed as overly direct. Policy reminders were considered clear but lacked persuasive impact.
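As a rough illustration of the kind of psycholinguistic comparison reported above, the sketch below scores the emotional tone of responses from each condition. It uses NLTK's VADER sentiment analyzer as a freely available stand-in for the study's psycholinguistic feature analysis, and the example responses are invented for demonstration only.

```python
# Illustrative tone comparison across response conditions.
# VADER sentiment is a stand-in metric, not the instrument used in the study.
import statistics

from nltk.sentiment import SentimentIntensityAnalyzer  # requires nltk.download("vader_lexicon")

analyzer = SentimentIntensityAnalyzer()


def mean_tone(responses: list[str]) -> float:
    """Average compound sentiment (-1 = most negative, +1 = most positive)."""
    return statistics.mean(analyzer.polarity_scores(r)["compound"] for r in responses)


# Invented example responses, one small set per condition.
conditions = {
    "policy_reminder": ["Please review the community guidelines before posting."],
    "baseline_gpt": ["I hear that you're frustrated, but let's keep this respectful."],
    "allygpt": ["That comment is hurtful. Let's focus on the actual issue instead."],
}

for name, responses in conditions.items():
    print(f"{name}: mean tone = {mean_tone(responses):.2f}")
```

In practice, the per-condition scores would be aggregated over all generated responses and compared statistically, alongside the human ratings of favorability, discussion-shifting potential, and likelihood of changing behavior.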
Conclusions: Our work showed that designing effective AI-generated bystander interventions requires a deep sensitivity to platform culture, social context, and user expectations. By combining psychological theory with adaptive, conversational design and ongoing feedback loops, future systems can better support bystanders, delivering interventions that are not only contextually appropriate but also socially resonant and behaviorally impactful. As such, this work serves as a foundation for scalable, human-centered AI systems that promote safer online spaces and users' mental well-being.