How Well Do My Results Generalize? Comparing Security and Privacy Survey Results from MTurk, Web, and Telephone Samples

Elissa M. Redmiles, Sean Kross, Michelle L. Mazurek
{"title":"How Well Do My Results Generalize? Comparing Security and Privacy Survey Results from MTurk, Web, and Telephone Samples","authors":"Elissa M. Redmiles, Sean Kross, Michelle L. Mazurek","doi":"10.1109/SP.2019.00014","DOIUrl":null,"url":null,"abstract":"Security and privacy researchers often rely on data collected from Amazon Mechanical Turk (MTurk) to evaluate security tools, to understand users' privacy preferences and to measure online behavior. Yet, little is known about how well Turkers' survey responses and performance on security- and privacy-related tasks generalizes to a broader population. This paper takes a first step toward understanding the generalizability of security and privacy user studies by comparing users' self-reports of their security and privacy knowledge, past experiences, advice sources, and behavior across samples collected using MTurk (n=480), a census-representative web-panel (n=428), and a probabilistic telephone sample (n=3,000) statistically weighted to be accurate within 2.7% of the true prevalence in the U.S. Surprisingly, the results suggest that: (1) MTurk responses regarding security and privacy experiences, advice sources, and knowledge are more representative of the U.S. population than are responses from the census-representative panel; (2) MTurk and general population reports of security and privacy experiences, knowledge, and advice sources are quite similar for respondents who are younger than 50 or who have some college education; and (3) respondents' answers to the survey questions we ask are stable over time and robust to relevant, broadly-reported news events. Further, differences in responses cannot be ameliorated with simple demographic weighting, possibly because MTurk and panel participants have more internet experience compared to their demographic peers. Together, these findings lend tempered support for the generalizability of prior crowdsourced security and privacy user studies; provide context to more accurately interpret the results of such studies; and suggest rich directions for future work to mitigate experience- rather than demographic-related sample biases.","PeriodicalId":272713,"journal":{"name":"2019 IEEE Symposium on Security and Privacy (SP)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"158","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Symposium on Security and Privacy (SP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SP.2019.00014","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 158

Abstract

Security and privacy researchers often rely on data collected from Amazon Mechanical Turk (MTurk) to evaluate security tools, to understand users' privacy preferences and to measure online behavior. Yet, little is known about how well Turkers' survey responses and performance on security- and privacy-related tasks generalizes to a broader population. This paper takes a first step toward understanding the generalizability of security and privacy user studies by comparing users' self-reports of their security and privacy knowledge, past experiences, advice sources, and behavior across samples collected using MTurk (n=480), a census-representative web-panel (n=428), and a probabilistic telephone sample (n=3,000) statistically weighted to be accurate within 2.7% of the true prevalence in the U.S. Surprisingly, the results suggest that: (1) MTurk responses regarding security and privacy experiences, advice sources, and knowledge are more representative of the U.S. population than are responses from the census-representative panel; (2) MTurk and general population reports of security and privacy experiences, knowledge, and advice sources are quite similar for respondents who are younger than 50 or who have some college education; and (3) respondents' answers to the survey questions we ask are stable over time and robust to relevant, broadly-reported news events. Further, differences in responses cannot be ameliorated with simple demographic weighting, possibly because MTurk and panel participants have more internet experience compared to their demographic peers. Together, these findings lend tempered support for the generalizability of prior crowdsourced security and privacy user studies; provide context to more accurately interpret the results of such studies; and suggest rich directions for future work to mitigate experience- rather than demographic-related sample biases.
我的结果有多好概括?比较来自MTurk、Web和电话样本的安全和隐私调查结果
安全和隐私研究人员经常依靠从亚马逊土耳其机器人(MTurk)收集的数据来评估安全工具,了解用户的隐私偏好,并衡量在线行为。然而,对于Turkers在安全和隐私相关任务上的调查反应和表现在更广泛的人群中有多普遍,我们知之甚少。本文通过比较使用MTurk (n=480)、人口普查代表性网络面板(n=428)和概率电话样本(n= 3000)收集的样本中用户对其安全和隐私知识、过去经验、建议来源和行为的自我报告,向理解安全和隐私用户研究的普遍性迈出了第一步,这些样本的统计加权在美国真实患病率的2.7%内准确。令人惊讶的是,结果表明:(1)土耳其人关于安全和隐私经验、建议来源和知识的回答比人口普查代表小组的回答更能代表美国人口;(2)对于年龄在50岁以下或受过一些大学教育的受访者来说,MTurk和一般人群在安全和隐私经验、知识和建议来源方面的报告非常相似;(3)受访者对我们提出的调查问题的回答随着时间的推移是稳定的,并且对于相关的、广泛报道的新闻事件是稳健的。此外,反应的差异不能通过简单的人口加权来改善,可能是因为MTurk和小组参与者比他们的人口统计学同龄人有更多的互联网经验。总之,这些发现为先前众包安全和隐私用户研究的普遍性提供了有限的支持;提供背景,以便更准确地解释此类研究的结果;并为未来的工作提出丰富的方向,以减轻经验,而不是人口统计相关的样本偏差。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信