Sequential Classification of Misinformation

arXiv - CS - Social and Information Networks, Pub Date: 2024-09-07, DOI: arxiv-2409.04860

Daniel Toma, Wasim Huleihel


Citations: 0

Abstract



In recent years there has been growing interest in online auditing of information flow over social networks, with the goal of monitoring undesirable effects such as misinformation and fake news. Most previous work on the subject focuses on the binary problem of classifying information as fake or genuine. Nonetheless, in many practical scenarios the multi-class/label setting is of particular importance. For example, a social media platform may want to distinguish between "true", "partly-true", and "false" information. Accordingly, in this paper we consider the problem of online multiclass classification of information flow. To that end, driven by empirical studies on information flow over real-world social media networks, we propose a probabilistic information flow model over graphs. The learning task is then to detect the label of the information flow, with the goal of minimizing a combination of the classification error and the detection time. For this problem, we propose two detection algorithms: the first is based on the well-known multiple sequential probability ratio test, while the second is a novel sequential decision algorithm based on graph neural networks. For both algorithms, we prove several strong statistical guarantees. We also construct a data-driven algorithm for learning the proposed probabilistic model. Finally, we test our algorithms on two real-world datasets and show that they outperform other state-of-the-art misinformation detection algorithms in terms of detection time and classification error.
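The multiple sequential probability ratio test named in the abstract can be illustrated with a generic textbook-style sketch. This is not the authors' algorithm and does not use their graph model: the function name `msprt`, the i.i.d. categorical observation model, the `likelihoods` table, and the posterior threshold are all assumptions made for illustration. The sketch updates the class posteriors after each observation and stops as soon as one class exceeds the threshold, which is how the trade-off between classification error and detection time arises.

```python
import math
from typing import List, Sequence, Tuple


def msprt(
    observations: Sequence[int],
    likelihoods: Sequence[Sequence[float]],
    priors: Sequence[float],
    threshold: float = 0.99,
) -> Tuple[int, int]:
    """Generic MSPRT over i.i.d. categorical observations (illustrative only).

    likelihoods[k][x] is the assumed probability of symbol x under class k.
    Returns (decided_class, number_of_observations_used).
    """
    # Work in log space to avoid underflow on long observation streams.
    log_post: List[float] = [math.log(p) for p in priors]
    best = max(range(len(log_post)), key=log_post.__getitem__)
    used = 0

    for used, x in enumerate(observations, start=1):
        # Accumulate the log-likelihood of the new observation for each class.
        log_post = [lp + math.log(likelihoods[k][x]) for k, lp in enumerate(log_post)]

        # Normalize with log-sum-exp to obtain posterior probabilities.
        peak = max(log_post)
        log_norm = peak + math.log(sum(math.exp(lp - peak) for lp in log_post))
        posteriors = [math.exp(lp - log_norm) for lp in log_post]

        best = max(range(len(posteriors)), key=posteriors.__getitem__)
        # Stop at the first time some class posterior reaches the threshold.
        if posteriors[best] >= threshold:
            return best, used

    # Stream exhausted before the threshold was reached: forced decision.
    return best, used


if __name__ == "__main__":
    # Hypothetical three-class setup ("true", "partly-true", "false") over two symbols.
    table = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
    uniform = [1 / 3, 1 / 3, 1 / 3]
    print(msprt([0] * 20, table, uniform, threshold=0.9))
```

Raising `threshold` lowers the chance of a wrong label at the cost of a later stopping time, and lowering it does the reverse.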
