Analyzing User Perspectives on Mobile App Privacy at Scale

Preksha Nema, Pauline Anthonysamy, N. Taft, Sai Teja Peddinti
Published in: 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), 2022-05-01
DOI: 10.1145/3510003.3510079 (https://doi.org/10.1145/3510003.3510079)
Citations: 8

Abstract

In this paper we present a methodology to analyze users' concerns and perspectives about privacy at scale. We leverage NLP techniques to process millions of mobile app reviews and extract privacy concerns. Our methodology is composed of a binary classifier that distinguishes between privacy and non-privacy related reviews. We use clustering to gather reviews that discuss similar privacy concerns, and employ summarization metrics to extract representative reviews to summarize each cluster. We apply our methods to 287M reviews for about 2M apps across the 29 categories in Google Play to identify top privacy pain points in mobile apps. We identified approximately 440K privacy-related reviews. We find that privacy-related reviews occur in all 29 categories, with some issues arising across numerous app categories and other issues only surfacing in a small set of app categories. We show empirical evidence that confirms dominant privacy themes: concerns about apps requesting unnecessary permissions, collection of personal information, frustration with privacy controls, tracking, and the selling of personal data. As far as we know, this is the first large-scale analysis to confirm these findings based on hundreds of thousands of user inputs. We also observe some unexpected findings, such as users warning each other not to install an app due to privacy issues, users uninstalling apps for privacy reasons, and positive reviews that reward developers for privacy-friendly apps. Finally, we discuss the implications of our method and findings for developers and app stores.
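The three stages named in the abstract (classify, cluster, pick representatives) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the models used, so the keyword filter below is a hypothetical stand-in for the trained binary classifier, greedy bag-of-words clustering stands in for the clustering step, and "most similar to the rest of the cluster" stands in for the summarization metrics. All names, the keyword list, the threshold, and the sample reviews are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list; the paper uses a trained classifier instead.
PRIVACY_TERMS = {"privacy", "permission", "permissions", "tracking",
                 "data", "contacts", "location"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def is_privacy_related(review):
    """Stand-in binary classifier: true if any privacy keyword appears."""
    return any(tok in PRIVACY_TERMS for tok in tokenize(review))

def vectorize(review):
    """Bag-of-words term counts for one review."""
    return Counter(tokenize(review))

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(reviews, threshold=0.3):
    """Greedy single pass: join the first cluster whose seed review is
    similar enough, otherwise start a new cluster."""
    clusters = []
    for review in reviews:
        vec = vectorize(review)
        for members in clusters:
            if cosine(vec, vectorize(members[0])) >= threshold:
                members.append(review)
                break
        else:
            clusters.append([review])
    return clusters

def representative(members):
    """Pick the review with the highest total similarity to its cluster."""
    vecs = [vectorize(r) for r in members]
    scores = [sum(cosine(v, w) for w in vecs) for v in vecs]
    return members[scores.index(max(scores))]

def analyze(reviews):
    """Filter privacy reviews, cluster them, and return one
    (representative review, cluster size) pair per cluster."""
    privacy_reviews = [r for r in reviews if is_privacy_related(r)]
    return [(representative(c), len(c)) for c in cluster(privacy_reviews)]

# Made-up sample reviews, for illustration only.
sample_reviews = [
    "Why does a flashlight app need permission to read my contacts?",
    "Asks for too many permissions, it does not need my contacts.",
    "Great game, love the graphics.",
    "This app keeps tracking my location even when closed.",
    "Stop tracking my location in the background!",
]

for text, size in analyze(sample_reviews):
    print(size, text)
```

On the sample input, the non-privacy review is filtered out and the remaining four reviews fall into a permissions cluster and a tracking cluster. At the scale reported in the abstract, each stand-in would need to be replaced by a learned model and a scalable clustering method.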