个体顺序数据处理的信息论方法

Luca Bonomi, Liyue Fan, Hongxia Jin
{"title":"个体顺序数据处理的信息论方法","authors":"Luca Bonomi, Liyue Fan, Hongxia Jin","doi":"10.1145/2835776.2835828","DOIUrl":null,"url":null,"abstract":"Fine-grained, personal data has been largely, continuously generated nowadays, such as location check-ins, web histories, physical activities, etc. Those data sequences are typically shared with untrusted parties for data analysis and promotional services. However, the individually-generated sequential data contains behavior patterns and may disclose sensitive information if not properly sanitized. Furthermore, the utility of the released sequence can be adversely affected by sanitization techniques. In this paper, we study the problem of individual sequence data sanitization with minimum utility loss, given user-specified sensitive patterns. We propose a privacy notion based on information theory and sanitize sequence data via generalization. We show the optimization problem is hard and develop two efficient heuristic solutions. Extensive experimental evaluations are conducted on real-world datasets and the results demonstrate the efficiency and effectiveness of our solutions.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"20 5 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"An Information-Theoretic Approach to Individual Sequential Data Sanitization\",\"authors\":\"Luca Bonomi, Liyue Fan, Hongxia Jin\",\"doi\":\"10.1145/2835776.2835828\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fine-grained, personal data has been largely, continuously generated nowadays, such as location check-ins, web histories, physical activities, etc. Those data sequences are typically shared with untrusted parties for data analysis and promotional services. However, the individually-generated sequential data contains behavior patterns and may disclose sensitive information if not properly sanitized. Furthermore, the utility of the released sequence can be adversely affected by sanitization techniques. In this paper, we study the problem of individual sequence data sanitization with minimum utility loss, given user-specified sensitive patterns. We propose a privacy notion based on information theory and sanitize sequence data via generalization. We show the optimization problem is hard and develop two efficient heuristic solutions. Extensive experimental evaluations are conducted on real-world datasets and the results demonstrate the efficiency and effectiveness of our solutions.\",\"PeriodicalId\":20567,\"journal\":{\"name\":\"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining\",\"volume\":\"20 5 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-02-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2835776.2835828\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2835776.2835828","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 18

摘要

如今,细粒度的个人数据在很大程度上不断产生,比如位置签到、网络历史记录、身体活动等。这些数据序列通常与不受信任的各方共享,用于数据分析和推广服务。但是,单独生成的顺序数据包含行为模式,如果不进行适当的处理,可能会泄露敏感信息。此外,释放序列的效用可能受到消毒技术的不利影响。在给定用户指定的敏感模式下,我们研究了效用损失最小的单个序列数据处理问题。我们提出了一种基于信息论的隐私概念,并通过泛化对序列数据进行净化。我们证明了优化问题是困难的,并开发了两个有效的启发式解决方案。在真实世界的数据集上进行了广泛的实验评估,结果证明了我们的解决方案的效率和有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
An Information-Theoretic Approach to Individual Sequential Data Sanitization
Fine-grained, personal data has been largely, continuously generated nowadays, such as location check-ins, web histories, physical activities, etc. Those data sequences are typically shared with untrusted parties for data analysis and promotional services. However, the individually-generated sequential data contains behavior patterns and may disclose sensitive information if not properly sanitized. Furthermore, the utility of the released sequence can be adversely affected by sanitization techniques. In this paper, we study the problem of individual sequence data sanitization with minimum utility loss, given user-specified sensitive patterns. We propose a privacy notion based on information theory and sanitize sequence data via generalization. We show the optimization problem is hard and develop two efficient heuristic solutions. Extensive experimental evaluations are conducted on real-world datasets and the results demonstrate the efficiency and effectiveness of our solutions.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信