Privacy aware data generation for testing database applications

Xintao Wu, Chintan Sanghvi, Yongge Wang, Yuliang Zheng
Published in: 9th International Database Engineering & Application Symposium (IDEAS'05), 2005-07-25
DOI: 10.1109/IDEAS.2005.45 (https://doi.org/10.1109/IDEAS.2005.45)
Citations: 14

Abstract

Testing database applications is of great importance, and a significant obstacle to it is the availability of representative data. In this paper, we investigate the problem of generating a synthetic database based on a priori knowledge about a production database. Our approach is to fit a general location model using various characteristics (e.g., constraints, statistics, rules) extracted from the production database and then to generate the synthetic data using the learned model. The generated data is valid and similar to the real data in terms of statistical distribution, so it can be used for both functional and performance testing. Because the extracted characteristics may contain information that an attacker could use to derive confidential information about individuals, we also present a disclosure analysis method that applies cell suppression for identity disclosure analysis and perturbation for value disclosure.
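
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook form of the general location model (categorical attributes define cells with multinomial probabilities, and a numerical attribute is normal within each cell with a cell-specific mean and a shared variance), restricted to a single numerical attribute for brevity. The function names, the `min_cell_count` suppression rule, and the Gaussian noise added to cell means are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def fit_general_location_model(rows, min_cell_count=2):
    """Fit a one-numeric-attribute general location model (illustrative).

    rows: list of (cell, value), where cell is a tuple of categorical values.
    Returns (cell_probs, cell_means, pooled_std).
    """
    by_cell = defaultdict(list)
    for cell, value in rows:
        by_cell[cell].append(value)

    # Assumed suppression rule: drop cells with too few records, since a
    # sparsely populated cell could identify an individual.
    kept = {c: v for c, v in by_cell.items() if len(v) >= min_cell_count}
    total = sum(len(v) for v in kept.values())
    if total == 0:
        raise ValueError("all cells were suppressed")

    cell_probs = {c: len(v) / total for c, v in kept.items()}
    cell_means = {c: mean(v) for c, v in kept.items()}

    # Shared (pooled) within-cell standard deviation.
    ss = sum((x - cell_means[c]) ** 2 for c, v in kept.items() for x in v)
    pooled_std = (ss / total) ** 0.5
    return cell_probs, cell_means, pooled_std


def perturb_means(cell_means, noise_std, rng):
    """Assumed perturbation: add zero-mean Gaussian noise to each cell mean."""
    return {c: m + rng.gauss(0.0, noise_std) for c, m in cell_means.items()}


def generate(cell_probs, cell_means, pooled_std, n, rng):
    """Draw n synthetic (cell, value) rows from the fitted model."""
    cells = list(cell_probs)
    weights = [cell_probs[c] for c in cells]
    return [
        (cell, rng.gauss(cell_means[cell], pooled_std))
        for cell in rng.choices(cells, weights=weights, k=n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    production = [
        (("F", "NC"), 50.0), (("F", "NC"), 54.0),
        (("M", "NC"), 60.0), (("M", "NC"), 64.0),
        (("M", "VA"), 99.0),  # singleton cell, suppressed before fitting
    ]
    probs, means, std = fit_general_location_model(production)
    noisy_means = perturb_means(means, noise_std=1.0, rng=rng)
    for row in generate(probs, noisy_means, std, n=5, rng=rng):
        print(row)
```

In this sketch only the released statistics (cell probabilities, perturbed means, pooled deviation) feed the generator, so the synthetic rows never copy production records directly; how much noise and which suppression threshold are appropriate is exactly what a disclosure analysis would have to decide.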