Protecting User Privacy: Obfuscating Discriminative Spatio-Temporal Footprints

Jinhyung D. Park, E. Seglem, Eric Lin, Andreas Züfle
Proceedings of the 1st ACM SIGSPATIAL Workshop on Recommendations for Location-based Services and Social Networks, published 2017-11-07
DOI: 10.1145/3148150.3148152 (https://doi.org/10.1145/3148150.3148152)
Citations: 7

Abstract

In recent years, applications that collect and store location data have become ubiquitous, allowing users to interact with other users and services in their digital or physical vicinity. However, using these geolocation services puts users at risk of serious privacy threats. For instance, state-of-the-art user-identification methods use geospatial trajectories derived from location-based services to identify users with alarmingly high accuracy. In this work, we address the problem of protecting user identities by presenting methods for obfuscating discriminative location data in users' profiles. We use data from the public Twitter API, collecting geotagged tweets from a select group of prolific users over a 12-week period. To minimize the amount of data obfuscated, we present two methods for identifying the most discriminative tweets. The first uses an Entropy-Maximizing Observation Function based on the number of tweets the user has posted and the number of people who have posted in that specific location, which ensures that tweets by infrequent users in unique locations are changed first. The second uses the identification algorithm to determine which users can be identified and changes only the tweets of those users. In both methods, we perturb a tweet by moving it to a location with more tweets, masking the identity of the user. A thorough experimental comparison with other baseline approaches shows that our model significantly decreases user identification accuracy while keeping the percentage of changed data to a minimum.
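
The first method can be illustrated with a minimal sketch. The abstract does not give the exact form of the Entropy-Maximizing Observation Function, so the score below, 1 / (tweets by the user × distinct users at the location), is only an assumed stand-in that ranks tweets by infrequent users in rarely visited locations first. The function names, the `budget` parameter, the toy data, and the choice of the single most-tweeted location as the perturbation target are all hypothetical.

```python
from collections import Counter, defaultdict


def discriminative_scores(tweets):
    """Score each (user, location) tweet; a higher score means more identifying.

    Assumed stand-in score, not the paper's function:
    1 / (tweets posted by the user * distinct users seen at the location).
    """
    tweets_per_user = Counter(user for user, _ in tweets)
    users_per_location = defaultdict(set)
    for user, location in tweets:
        users_per_location[location].add(user)
    return [
        1.0 / (tweets_per_user[user] * len(users_per_location[location]))
        for user, location in tweets
    ]


def obfuscate(tweets, budget):
    """Move the `budget` highest-scoring tweets to the most-tweeted location."""
    scores = discriminative_scores(tweets)
    tweets_per_location = Counter(location for _, location in tweets)
    busiest = tweets_per_location.most_common(1)[0][0]
    # Tweet indices ordered from most to least discriminative.
    order = sorted(range(len(tweets)), key=lambda i: scores[i], reverse=True)
    result = list(tweets)
    for i in order[:budget]:
        user, _ = result[i]
        result[i] = (user, busiest)
    return result


# Toy data: (user, location) pairs standing in for geotagged tweets.
tweets = [
    ("alice", "cafe"), ("alice", "cafe"), ("alice", "park"),
    ("bob", "cafe"), ("bob", "cafe"),
    ("carol", "lighthouse"),
]
obfuscated = obfuscate(tweets, budget=1)
```

With a budget of one, the only tweet changed is the single post by "carol" at "lighthouse", which is both the user's only tweet and the location's only visitor. A small budget corresponds to the stated goal of keeping the percentage of changed data low.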