Beyond trial-and-error: Predicting user abandonment after a moderation intervention
Benedetta Tessa, Lorenzo Cima, Amaury Trujillo, Marco Avvenuti, Stefano Cresci
Engineering Applications of Artificial Intelligence, vol. 162, Article 112375 (published 2025-09-27). DOI: 10.1016/j.engappai.2025.112375
Abstract: Current content moderation follows a reactive, trial-and-error approach, in which interventions are applied and their effects are measured only post hoc. In contrast, we introduce a proactive, predictive approach that enables moderators to anticipate the impact of their actions before implementation. We propose and tackle the new task of predicting user abandonment following a moderation intervention. We study the reactions of 16,540 users to a massive ban of online communities on Reddit, training a set of binary classifiers to identify the users who would abandon the platform after the intervention, a problem of great practical relevance. We leverage a dataset of 13.8 million posts to compute a large and diverse set of 142 features that convey information about the activity, toxicity, relations, and writing style of the users. We obtain promising results, with the best-performing model achieving a micro F1-score of 0.914. Our model shows robust generalizability when applied to users from previously unseen communities. Furthermore, we identify activity features as the most informative predictors, followed by relational and toxicity features, while writing style features exhibit limited utility. Theoretically, our results demonstrate the feasibility of adopting a predictive machine learning approach to estimate the effects of moderation interventions. Practically, this work marks a fundamental shift from reactive to predictive moderation, equipping platform administrators with intelligent tools to strategically plan interventions, minimize unintended consequences, and optimize user engagement.
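To make the prediction task described in the abstract concrete, the following is a minimal, hypothetical sketch in Python using scikit-learn. The feature file, the column names (including category prefixes such as "activity_" or "toxicity_"), and the choice of a random forest are illustrative assumptions, not the authors' exact pipeline; the sketch only mirrors the overall setup of a binary classifier over per-user features, evaluated with the micro F1-score and inspected for per-category feature importance.

```python
# Hypothetical sketch of the abandonment-prediction task; not the authors' pipeline.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Assumed input: one row per user, ~142 numeric features grouped into four
# categories, plus a binary label (1 = user abandoned the platform after the ban).
df = pd.read_csv("user_features.csv")  # hypothetical file name
feature_cols = [c for c in df.columns if c != "abandoned"]
X, y = df[feature_cols], df["abandoned"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# One of several possible binary classifiers for this task.
clf = RandomForestClassifier(n_estimators=300, random_state=42)
clf.fit(X_train, y_train)

# Micro-averaged F1, the metric reported in the abstract.
pred = clf.predict(X_test)
print("micro F1 =", f1_score(y_test, pred, average="micro"))

# Rough per-category importance, assuming feature names are prefixed with their
# category (e.g. "activity_posts_per_day", "toxicity_mean_score").
importances = pd.Series(clf.feature_importances_, index=feature_cols)
categories = [c.split("_")[0] for c in feature_cols]
print(importances.groupby(categories).sum().sort_values(ascending=False))
```

In the paper's framing, a setup of this kind would be repeated across several classifier families and evaluated on users from held-out communities to assess generalizability, as the abstract describes.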
Journal introduction:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.