Safeguarding Decentralized Social Media: LLM Agents for Automating Community Rule Compliance

Impact Factor: 2.9 | JCR Q1 (Social Sciences)

Online Social Networks and Media, Volume 48, Article 100319 | Pub Date: 2025-06-24 | DOI: 10.1016/j.osnem.2025.100319

Lucio La Cava, Andrea Tagarelli

Citations: 0

Abstract

Ensuring content compliance with community guidelines is crucial for maintaining healthy online social environments. However, traditional human-based compliance checking struggles with scaling due to the increasing volume of user-generated content and a limited number of moderators. Recent advancements in Natural Language Understanding demonstrated by Large Language Models unlock new opportunities for automated content compliance verification. This work evaluates six AI-agents built on Open-LLMs for automated rule compliance checking in Decentralized Social Networks, a challenging environment due to heterogeneous community scopes and rules. By analyzing over 50,000 posts from hundreds of Mastodon servers, we find that AI-agents effectively detect non-compliant content, grasp linguistic subtleties, and adapt to diverse community contexts. Most agents also show high inter-rater reliability and consistency in score justification and suggestions for compliance. Human-based evaluation with domain experts confirmed the agents’ reliability and usefulness, rendering them promising tools for semi-automated or human-in-the-loop content moderation systems.

Warning: This manuscript may contain sensitive content as it quotes harmful/hateful social media posts.
