PAC learning halfspaces in non-interactive local differential privacy model with public unlabeled data
Authors: Jinyan Su, Jinhui Xu, Di Wang
Journal: Journal of Computer and System Sciences, volume 141, Article 103496
Published: 2023-12-06
Publication type: Journal Article
DOI: 10.1016/j.jcss.2023.103496
URL: https://www.sciencedirect.com/science/article/pii/S0022000023001010
Impact factor: 1.1 (JCR Q1)
Citations: 0
Abstract
In this paper, we study the problem of PAC learning halfspaces in the non-interactive local differential privacy (NLDP) model. To break the barrier of exponential sample complexity, previous work studied a relaxed setting in which the server has access to additional public but unlabeled data. We continue in this direction. Specifically, we consider the problem under the standard setting rather than the large-margin setting studied before. Under different mild assumptions on the underlying data distribution, we propose two approaches, based on the Massart noise model and on self-supervised learning, and show that it is possible to achieve sample complexities that are only linear in the dimension and polynomial in the other terms for both private and public data, which significantly improves on the previous results. Our methods could also be used for other private PAC learning problems.
About the journal:
The Journal of Computer and System Sciences publishes original research papers in computer science and related subjects in system science, with attention to the relevant mathematical theory. Applications-oriented papers may also be accepted and they are expected to contain deep analytic evaluation of the proposed solutions.
Research areas include traditional subjects such as:
• Theory of algorithms and computability
• Formal languages
• Automata theory
Contemporary subjects such as:
• Complexity theory
• Algorithmic Complexity
• Parallel & distributed computing
• Computer networks
• Neural networks
• Computational learning theory
• Database theory & practice
• Computer modeling of complex systems
• Security and Privacy.