Fake News Detection via Wisdom of Synthetic & Representative Crowds
François t'Serstevens, Roberto Cerina, Giulia Piccillo
arXiv - CS - Computers and Society, 2024-08-06 (arXiv:2408.03154)
Abstract
Social media companies have struggled to provide a democratically legitimate
definition of "Fake News". Reliance on expert judgment has attracted criticism
due to a general trust deficit and political polarisation. Approaches reliant
on the "wisdom of the crowds" are a cost-effective, transparent and inclusive
alternative. This paper provides a novel end-to-end methodology to detect fake
news on X via "wisdom of the synthetic & representative crowds". We deploy an
online survey on the Lucid platform to gather veracity assessments for a number
of pandemic-related tweets from crowd-workers. Borrowing from the multilevel
regression and post-stratification (MrP) literature, we train a hierarchical
Bayesian model to predict the veracity of
each tweet from the perspective of different personae from the population of
interest. We then weight the predicted veracity assessments according to a
representative stratification frame, such that decisions about "fake" tweets
are representative of the overall polity of interest. Based on these aggregated
scores, we analyse a corpus of tweets and perform a second MrP to generate
state-level estimates of the number of people who share fake news. We find
small but statistically meaningful heterogeneity in fake news sharing across US
states. At the individual level: i. sharing fake news is generally rare, with
an average sharing probability in the interval [0.07, 0.14]; ii. there is strong
evidence that Democrats share less fake news, with a reduction in the sharing
odds of [57.3%, 3.9%] relative to the average user; iii. when Republican
definitions of fake news are used, it is Republicans who show a decrease in the
propensity to share fake news, worth [50.8%, 2.0%]; iv. there is some evidence
that women share less fake news than men, an effect worth a [29.5%, 4.9%] decrease.
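
The post-stratification step described in the abstract (weighting per-persona veracity predictions by a representative stratification frame) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `poststratify`, the four demographic cells, their predicted probabilities, their population counts, and the 0.5 decision threshold are all hypothetical values chosen for illustration. In the paper the per-cell predictions would come from the fitted hierarchical Bayesian model.

```python
from typing import Sequence


def poststratify(cell_predictions: Sequence[float],
                 cell_counts: Sequence[float]) -> float:
    """Population-weighted average of per-cell predictions.

    cell_predictions: predicted probability, for each demographic cell,
        that a persona in that cell rates the tweet as fake.
    cell_counts: number of people in each cell of the stratification frame.
    """
    if len(cell_predictions) != len(cell_counts):
        raise ValueError("predictions and counts must have the same length")
    total = sum(cell_counts)
    if total <= 0:
        raise ValueError("stratification frame must have a positive total count")
    # Each cell contributes in proportion to its share of the population.
    weighted = sum(p * n for p, n in zip(cell_predictions, cell_counts))
    return weighted / total


# Hypothetical example: four cells of an assumed stratification frame.
cell_predictions = [0.80, 0.60, 0.30, 0.10]  # assumed model outputs per cell
cell_counts = [100, 300, 400, 200]           # assumed population counts per cell

score = poststratify(cell_predictions, cell_counts)  # approximately 0.40

# Assumed decision rule (not from the paper): flag when a majority of the
# representative population would rate the tweet as fake.
FAKE_THRESHOLD = 0.5
is_fake = score > FAKE_THRESHOLD
```

The same weighting applies when the frame is restricted to a subgroup (for example, only the cells for one party), which is one way to read the abstract's reference to partisan "definitions" of fake news.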