Streaming Histogram Publication Over Weighted Sliding Windows Under Differential Privacy
Authors: Xiujun Wang; Lei Mo; Xiao Zheng; Zhe Dang
Journal: Tsinghua Science and Technology, 29(6), pp. 1674-1693. Published 2024-06-20.
DOI: 10.26599/TST.2023.9010083
Link: https://ieeexplore.ieee.org/document/10566026/
Impact factor: 6.6; JCR: Q1 (Multidisciplinary); Category: Computer Science
Continuously publishing histograms over data streams is crucial to many real-time applications, as it not only provides critical statistical information but also reduces the risk of privacy leakage. Because the importance of elements in a data stream usually decreases over time, we model a data stream as a sequence of weighted sliding windows and study how to publish histograms over these windows continuously. Existing methods can hardly solve this problem in real time, because they need to buffer all elements in each sliding window, which results in high computational overhead and a prohibitive storage burden. In this paper, we overcome this drawback by proposing an online algorithm, Efficient Streaming Histogram Publishing (ESHP), that continuously publishes histograms over weighted sliding windows. Specifically, our method first creates a novel sketching structure, called Approximate-Estimate Sketch (AESketch), to maintain the counting information of each histogram interval at every time instance; it then creates histograms that satisfy the differential privacy requirement by adding appropriate noise values to the sketching structure. Rigorous theoretical analysis and extensive experimental results demonstrate that ESHP offers equivalent data utility with significantly lower computational overhead and storage costs than existing methods.
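To make the problem setting concrete, the following is a minimal sketch of the naive baseline: it buffers every element of the window, which is the cost the abstract says ESHP avoids, and it is not the AESketch structure, whose details the abstract does not give. Several things here are assumptions made only for illustration: the exponential weighting (`decay ** age`), the Laplace mechanism, the add/remove-one-element neighbouring relation, and the names `weighted_histogram`, `laplace_noise`, and `publish`. The noise is calibrated for a single release only; composition across successive windows is not handled.

```python
import random
from collections import deque


def weighted_histogram(window, edges, decay):
    """Weighted bin counts: the newest element has weight 1, age a has decay**a.

    The exponential weighting is an illustrative assumption, not the paper's.
    """
    counts = [0.0] * (len(edges) - 1)
    for age, value in enumerate(reversed(window)):
        for i in range(len(counts)):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += decay ** age
                break
    return counts


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def publish(window, edges, decay, epsilon, rng):
    # Adding or removing one element changes one bin by at most its
    # weight (<= 1), so the L1 sensitivity of one release is 1 and the
    # Laplace scale is 1 / epsilon (single release, by assumption).
    scale = 1.0 / epsilon
    exact = weighted_histogram(window, edges, decay)
    return [c + laplace_noise(scale, rng) for c in exact]


if __name__ == "__main__":
    rng = random.Random(0)
    window = deque(maxlen=4)      # naive buffer of the last 4 elements
    edges = [0, 10, 20, 30]       # three histogram intervals
    for x in [3, 12, 25, 7, 14]:
        window.append(x)
        print(publish(window, edges, decay=0.5, epsilon=1.0, rng=rng))
```

The buffer grows with the window size and the histogram is recomputed from scratch at every time instance, which illustrates why a compact sketch that is updated incrementally is attractive in the streaming setting.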
About the journal:
Tsinghua Science and Technology (Tsinghua Sci Technol) started publication in 1996. It is an international academic journal sponsored by Tsinghua University and is published bimonthly. The journal aims to present up-to-date scientific achievements in computer science, electronic engineering, and other IT fields. Contributions from all over the world are welcome.