Can you Trust the Trend?: Discovering Simpson's Paradoxes in Social Data

N. Alipourfard, Peter G. Fennell, Kristina Lerman
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
Published: 2018-01-13
DOI: 10.1145/3159652.3159684 (https://doi.org/10.1145/3159652.3159684)
Citations: 26

Abstract

We investigate how Simpson's paradox affects analysis of trends in social data. According to the paradox, the trends observed in data that has been aggregated over an entire population may be different from, and even opposite to, those of the underlying subgroups. Failure to take this effect into account can lead analysis to wrong conclusions. We present a statistical method to automatically identify Simpson's paradox in data by comparing statistical trends in the aggregate data to those in the disaggregated subgroups. We apply the approach to data from Stack Exchange, a popular question-answering platform, to analyze factors affecting answerer performance, specifically, the likelihood that an answer written by a user will be accepted by the asker as the best answer to his or her question. Our analysis confirms a known Simpson's paradox and identifies several new instances. These paradoxes provide novel insights into user behavior on Stack Exchange.
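
The comparison of aggregate and subgroup trends described in the abstract can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify in detail: it assumes a plain least-squares slope as the "trend", skips any significance testing, and uses invented toy numbers rather than Stack Exchange data. The names `slope`, `sign`, and `trend_reverses` are hypothetical.

```python
from collections import defaultdict


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0  # no variation in x, so no trend can be estimated
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def trend_reverses(records):
    """Check whether the aggregate trend opposes the trend shared by all subgroups.

    records is an iterable of (group, x, y) tuples.
    Returns (reverses, aggregate_slope, slopes_by_group).
    """
    records = list(records)
    aggregate = slope([x for _, x, _ in records], [y for _, _, y in records])

    # Disaggregate the data by the conditioning variable.
    by_group = defaultdict(list)
    for group, x, y in records:
        by_group[group].append((x, y))
    group_slopes = {
        group: slope([x for x, _ in pts], [y for _, y in pts])
        for group, pts in by_group.items()
    }

    # Flag a reversal only when every subgroup agrees on a nonzero direction
    # and the aggregate trend points the other way.
    signs = {sign(s) for s in group_slopes.values()}
    reverses = (
        len(signs) == 1
        and signs != {0}
        and sign(aggregate) == -next(iter(signs))
    )
    return reverses, aggregate, group_slopes


# Invented toy data: y falls with x inside each group, but group "B" sits at
# both higher x and higher y, so the pooled trend rises.
toy = [
    ("A", 1, 10), ("A", 2, 9), ("A", 3, 8),
    ("B", 6, 20), ("B", 7, 19), ("B", 8, 18),
]
reverses, aggregate, group_slopes = trend_reverses(toy)
print(reverses, round(aggregate, 2), group_slopes)
```

A real analysis would also need to test whether each fitted trend is statistically distinguishable from zero before declaring a paradox; that step is left out here.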