Differentially Private Boxplots
Kelly Ramsay, Jairo Diaz-Rodriguez
arXiv - STAT - Other Statistics, 2024-05-30 (arXiv 2405.20415)
Abstract
Despite the potential of differentially private data visualization to
harmonize data analysis and privacy, research in this area remains relatively
underdeveloped. Boxplots are a widely used visualization for
summarizing a dataset and for comparing multiple datasets. Consequently,
we introduce a differentially private boxplot. We evaluate its effectiveness
for displaying location, scale, skewness and tails of a given empirical
distribution. In our theoretical exposition, we show that the location and
scale of the boxplot are estimated with optimal sample complexity, and the
skewness and tails are estimated consistently. In simulations, we show that
this boxplot performs similarly to a non-private boxplot, and it outperforms a
boxplot naively constructed from existing differentially private quantile
algorithms. Additionally, we conduct a real-data analysis of Airbnb listings,
which shows that a comparable analysis can be achieved through differentially
private boxplot visualization.
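
To make the naive baseline mentioned above concrete, the following is a minimal sketch of a boxplot assembled from a generic differentially private quantile routine. It is not the authors' method. It assumes a standard exponential-mechanism quantile estimator, data bounds `lower` and `upper` that are fixed in advance without looking at the data, and an even split of the privacy budget over the three quartiles via basic composition. The function names `dp_quantile` and `naive_dp_boxplot` are hypothetical, and the whiskers here are simply the 1.5 IQR fences computed from the private quartiles (post-processing), which differs from a standard boxplot whose whiskers end at observed data points.

```python
import numpy as np


def dp_quantile(x, q, epsilon, lower, upper, rng):
    """Exponential-mechanism quantile (illustrative; requires lower < upper).

    The output space [lower, upper] is cut into n + 1 intervals by the sorted,
    clipped data. An interval is scored by how far its rank is from q * n
    (sensitivity 1), weighted by its width, and a point is drawn uniformly
    from the selected interval.
    """
    n = len(x)
    z = np.concatenate(([lower], np.sort(np.clip(x, lower, upper)), [upper]))
    widths = np.diff(z)                    # n + 1 interval lengths
    ranks = np.arange(n + 1)               # data points lying below each interval
    utility = -np.abs(ranks - q * n)
    with np.errstate(divide="ignore"):     # zero-width intervals get weight 0
        log_w = np.log(widths) + 0.5 * epsilon * utility
    log_w -= log_w.max()                   # stabilise before exponentiating
    probs = np.exp(log_w)
    probs /= probs.sum()
    i = rng.choice(n + 1, p=probs)
    return rng.uniform(z[i], z[i + 1])


def naive_dp_boxplot(x, epsilon, lower, upper, seed=None):
    """Naive private boxplot: three private quartiles plus derived fences."""
    rng = np.random.default_rng(seed)
    eps_each = epsilon / 3.0               # assumption: even split, basic composition
    # Sorting the three releases is post-processing and keeps them ordered.
    q1, med, q3 = sorted(
        dp_quantile(x, q, eps_each, lower, upper, rng) for q in (0.25, 0.5, 0.75)
    )
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        # Fences depend only on the private quartiles and the public bounds.
        "whisker_low": max(lower, q1 - 1.5 * iqr),
        "whisker_high": min(upper, q3 + 1.5 * iqr),
    }


if __name__ == "__main__":
    data = np.random.default_rng(0).normal(50.0, 10.0, 5000)
    print(naive_dp_boxplot(data, epsilon=1.0, lower=0.0, upper=100.0, seed=1))
```

Splitting the budget three ways means each quartile is estimated at a third of the total epsilon, which is one plausible reason a naive construction of this kind could lose accuracy relative to a purpose-built private boxplot.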