{"title":"Exactly Optimal and Communication-Efficient Private Estimation via Block Designs","authors":"Hyun-Young Park;Seung-Hyun Nam;Si-Hyeon Lee","doi":"10.1109/JSAIT.2024.3381195","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3381195","url":null,"abstract":"In this paper, we propose a new class of local differential privacy (LDP) schemes based on combinatorial block designs for discrete distribution estimation. This class not only recovers many known LDP schemes in a unified framework of combinatorial block design, but also suggests a novel way of finding new schemes achieving the exactly optimal (or near-optimal) privacy-utility trade-off with lower communication costs. Indeed, we find many new LDP schemes that achieve the exactly optimal privacy-utility trade-off, with the minimum communication cost among all the unbiased or consistent schemes, for a certain set of input data size and LDP constraint. Furthermore, to partially solve the sparse existence issue of block design schemes, we consider a broader class of LDP schemes based on regular and pairwise-balanced designs, called RPBD schemes, which relax one of the symmetry requirements on block designs. By considering this broader class of RPBD schemes, we can find LDP schemes achieving near-optimal privacy-utility trade-off with reasonably low communication costs for a much larger set of input data size and LDP constraint.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"123-134"},"PeriodicalIF":0.0,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140621268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Robust to Distributional Uncertainties and Adversarial Data","authors":"Alireza Sadeghi;Gang Wang;Georgios B. Giannakis","doi":"10.1109/JSAIT.2024.3381869","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3381869","url":null,"abstract":"Successful training of data-intensive deep neural networks critically rely on vast, clean, and high-quality datasets. In practice however, their reliability diminishes, particularly with noisy, outlier-corrupted data samples encountered in testing. This challenge intensifies when dealing with anonymized, heterogeneous data sets stored across geographically distinct locations due to, e.g., privacy concerns. This present paper introduces robust learning frameworks tailored for centralized and federated learning scenarios. Our goal is to fortify model resilience with a focus that lies in (i) addressing distribution shifts from training to inference time; and, (ii) ensuring test-time robustness, when a trained model may encounter outliers or adversarially contaminated test data samples. To this aim, we start with a centralized setting where the true data distribution is considered unknown, but residing within a Wasserstein ball centered at the empirical distribution. We obtain robust models by minimizing the worst-case expected loss within this ball, yielding an intractable infinite-dimensional optimization problem. Upon leverage the strong duality condition, we arrive at a tractable surrogate learning problem. We develop two stochastic primal-dual algorithms to solve the resultant problem: one for \u0000<inline-formula> <tex-math>$epsilon $ </tex-math></inline-formula>\u0000-accurate convex sub-problems and another for a single gradient ascent step. We further develop a distributionally robust federated learning framework to learn robust model using heterogeneous data sets stored at distinct locations by solving per-learner’s sub-problems locally, offering robustness with modest computational overhead and considering data distribution. Numerical tests corroborate merits of our training algorithms against distributional uncertainties and adversarially corrupted test data samples.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"105-122"},"PeriodicalIF":0.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140619580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exactly Tight Information-Theoretic Generalization Error Bound for the Quadratic Gaussian Problem","authors":"Ruida Zhou;Chao Tian;Tie Liu","doi":"10.1109/JSAIT.2024.3380598","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3380598","url":null,"abstract":"We provide a new information-theoretic generalization error bound that is exactly tight (i.e., matching even the constant) for the canonical quadratic Gaussian (location) problem. Most existing bounds are order-wise loose in this setting, which has raised concerns about the fundamental capability of information-theoretic bounds in reasoning the generalization behavior for machine learning. The proposed new bound adopts the individual-sample-based approach proposed by Bu et al., but also has several key new ingredients. Firstly, instead of applying the change of measure inequality on the loss function, we apply it to the generalization error function itself; secondly, the bound is derived in a conditional manner; lastly, a reference distribution is introduced. The combination of these components produces a KL-divergence-based generalization error bound. We show that although the latter two new ingredients can help make the bound exactly tight, removing them does not significantly degrade the bound, leading to an asymptotically tight mutual-information-based bound. We further consider the vector Gaussian setting, where a direct application of the proposed bound again does not lead to tight bounds except in special cases. A refined bound is then proposed by a decomposition of loss functions, leading to a tight bound for the vector setting.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"94-104"},"PeriodicalIF":0.0,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140606029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Summary Statistic Privacy in Data Sharing","authors":"Zinan Lin;Shuaiqi Wang;Vyas Sekar;Giulia Fanti","doi":"10.1109/JSAIT.2024.3403811","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3403811","url":null,"abstract":"We study a setting where a data holder wishes to share data with a receiver, without revealing certain summary statistics of the data distribution (e.g., mean, standard deviation). It achieves this by passing the data through a randomization mechanism. We propose summary statistic privacy, a metric for quantifying the privacy risk of such a mechanism based on the worst-case probability of an adversary guessing the distributional secret within some threshold. Defining distortion as a worst-case Wasserstein-1 distance between the real and released data, we prove lower bounds on the tradeoff between privacy and distortion. We then propose a class of quantization mechanisms that can be adapted to different data distributions. We show that the quantization mechanism’s privacy-distortion tradeoff matches our lower bounds under certain regimes, up to small constant factors. Finally, we demonstrate on real-world datasets that the proposed quantization mechanisms achieve better privacy-distortion tradeoffs than alternative privacy mechanisms.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"369-384"},"PeriodicalIF":0.0,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141422569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Straggler-Resilient Differentially Private Decentralized Learning","authors":"Yauhen Yakimenka;Chung-Wei Weng;Hsuan-Yin Lin;Eirik Rosnes;Jörg Kliewer","doi":"10.1109/JSAIT.2024.3400995","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3400995","url":null,"abstract":"We consider the straggler problem in decentralized learning over a logical ring while preserving user data privacy. Especially, we extend the recently proposed framework of differential privacy (DP) amplification by decentralization by Cyffers and Bellet to include overall training latency—comprising both computation and communication latency. Analytical results on both the convergence speed and the DP level are derived for both a skipping scheme (which ignores the stragglers after a timeout) and a baseline scheme that waits for each node to finish before the training continues. A trade-off between overall training latency, accuracy, and privacy, parameterized by the timeout of the skipping scheme, is identified and empirically validated for logistic regression on a real-world dataset and for image classification using the MNIST and CIFAR-10 datasets.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"407-423"},"PeriodicalIF":0.0,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141495176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Algorithmic Recourse Under Model Multiplicity With Probabilistic Guarantees","authors":"Faisal Hamman;Erfaun Noorani;Saumitra Mishra;Daniele Magazzeni;Sanghamitra Dutta","doi":"10.1109/JSAIT.2024.3401407","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3401407","url":null,"abstract":"There is an emerging interest in generating robust algorithmic recourse that would remain valid if the model is updated or changed even slightly. Towards finding robust algorithmic recourse (or counterfactual explanations), existing literature often assumes that the original model \u0000<italic>m</i>\u0000 and the new model \u0000<italic>M</i>\u0000 are bounded in the parameter space, i.e., \u0000<inline-formula> <tex-math>$|text {Params}(M){-}text {Params}(m)|{lt }Delta $ </tex-math></inline-formula>\u0000. However, models can often change significantly in the parameter space with little to no change in their predictions or accuracy on the given dataset. In this work, we introduce a mathematical abstraction termed \u0000<italic>naturally-occurring</i>\u0000 model change, which allows for arbitrary changes in the parameter space such that the change in predictions on points that lie on the data manifold is limited. Next, we propose a measure – that we call \u0000<italic>Stability</i>\u0000 – to quantify the robustness of counterfactuals to potential model changes for differentiable models, e.g., neural networks. Our main contribution is to show that counterfactuals with sufficiently high value of \u0000<italic>Stability</i>\u0000 as defined by our measure will remain valid after potential “naturally-occurring” model changes with high probability (leveraging concentration bounds for Lipschitz function of independent Gaussians). Since our quantification depends on the local Lipschitz constant around a data point which is not always available, we also examine estimators of our proposed measure and derive a fundamental lower bound on the sample size required to have a precise estimate. We explore methods of using stability measures to generate robust counterfactuals that are close, realistic, and remain valid after potential model changes. This work also has interesting connections with model multiplicity, also known as the Rashomon effect.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"357-368"},"PeriodicalIF":0.0,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141435351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contraction of Locally Differentially Private Mechanisms","authors":"Shahab Asoodeh;Huanyu Zhang","doi":"10.1109/JSAIT.2024.3397305","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397305","url":null,"abstract":"We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between \u0000<inline-formula> <tex-math>$P{mathsf K}$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$Q{mathsf K}$ </tex-math></inline-formula>\u0000 output distributions of an \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000-LDP mechanism \u0000<inline-formula> <tex-math>$mathsf K$ </tex-math></inline-formula>\u0000 in terms of a divergence between the corresponding input distributions P and Q, respectively. Our first main technical result presents a sharp upper bound on the \u0000<inline-formula> <tex-math>$chi ^{2}$ </tex-math></inline-formula>\u0000-divergence \u0000<inline-formula> <tex-math>$chi ^{2}(P{mathsf K}|Q{mathsf K})$ </tex-math></inline-formula>\u0000 in terms of \u0000<inline-formula> <tex-math>$chi ^{2}(P|Q)$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on \u0000<inline-formula> <tex-math>$chi ^{2}(P{mathsf K}|Q{mathsf K})$ </tex-math></inline-formula>\u0000 in terms of total variation distance \u0000<inline-formula> <tex-math>${textsf {TV}}(P, Q)$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000. We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam’s, Assouad’s, and the mutual information methods —powerful tools for bounding minimax estimation risks. These results are shown to lead to tighter privacy analyses than the state-of-the-arts in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"385-395"},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141494817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing","authors":"Lucas Monteiro Paes;Ananda Theertha Suresh;Alex Beutel;Flavio P. Calmon;Ahmad Beirami","doi":"10.1109/JSAIT.2024.3397741","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397741","url":null,"abstract":"Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the performance of a fixed ML model across population groups defined by multiple sensitive attributes (e.g., race and sex and age). Here, the sample complexity for estimating the worst-case performance gap across groups (e.g., the largest difference in error rates) increases exponentially with the number of group-denoting sensitive attributes. To address this issue, we propose an approach to test for performance disparities based on Conditional Value-at-Risk (CVaR). By allowing a small probabilistic slack on the groups over which a model has approximately equal performance, we show that the sample complexity required for discovering performance violations is reduced exponentially to be at most upper bounded by the square root of the number of groups. As a byproduct of our analysis, when the groups are weighted by a specific prior distribution, we show that Rényi entropy of order 2/3 of the prior distribution captures the sample complexity of the proposed CVaR test algorithm. Finally, we also show that there exists a non-i.i.d. data collection strategy that results in a sample complexity independent of the number of groups.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"659-674"},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142736343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and Robust Classification for Sparse Attacks","authors":"Mark Beliaev;Payam Delgosha;Hamed Hassani;Ramtin Pedarsani","doi":"10.1109/JSAIT.2024.3397187","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397187","url":null,"abstract":"Over the past two decades, the rise in adoption of neural networks has surged in parallel with their performance. Concurrently, we have observed the inherent fragility of these prediction models: small changes to the inputs can induce classification errors across entire datasets. In the following study, we examine perturbations constrained by the \u0000<inline-formula> <tex-math>$ell _{0}$ </tex-math></inline-formula>\u0000–norm, a potent attack model in the domains of computer vision, malware detection, and natural language processing. To combat this adversary, we introduce a novel defense technique comprised of two components: “truncation” and “adversarial training”. Subsequently, we conduct a theoretical analysis of the Gaussian mixture setting and establish the asymptotic optimality of our proposed defense. Based on this obtained insight, we broaden the application of our technique to neural networks. Lastly, we empirically validate our results in the domain of computer vision, demonstrating substantial enhancements in the robust classification error of neural networks.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"261-272"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141096203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Causal Bandits for Linear Models","authors":"Zirui Yan;Arpan Mukherjee;Burak Varıcı;Ali Tajer","doi":"10.1109/JSAIT.2024.3373595","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3373595","url":null,"abstract":"The sequential design of experiments for optimizing a reward function in causal systems can be effectively modeled by the sequential design of interventions in causal bandits (CBs). In the existing literature on CBs, a critical assumption is that the causal models remain constant over time. However, this assumption does not necessarily hold in complex systems, which constantly undergo temporal model fluctuations. This paper addresses the robustness of CBs to such model fluctuations. The focus is on causal systems with linear structural equation models (SEMs). The SEMs and the time-varying pre- and post-interventional statistical models are all unknown. Cumulative regret is adopted as the design criteria, based on which the objective is to design a sequence of interventions that incur the smallest cumulative regret with respect to an oracle aware of the entire causal model and its fluctuations. First, it is established that the existing approaches fail to maintain regret sub-linearity with even a few instances of model deviation. Specifically, when the number of instances with model deviation is as few as \u0000<inline-formula> <tex-math>$T^{frac {1}{2L}}$ </tex-math></inline-formula>\u0000, where \u0000<inline-formula> <tex-math>$T$ </tex-math></inline-formula>\u0000 is the time horizon and \u0000<inline-formula> <tex-math>$L$ </tex-math></inline-formula>\u0000 is the length of the longest causal path in the graph, the existing algorithms will have linear regret in \u0000<inline-formula> <tex-math>$T$ </tex-math></inline-formula>\u0000. For instance, when \u0000<inline-formula> <tex-math>$T=10^{5}$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$L=3$ </tex-math></inline-formula>\u0000, model deviations in 6 out of 105 instances result in a linear regret. Next, a robust CB algorithm is designed, and its regret is analyzed, where upper and information-theoretic lower bounds on the regret are established. Specifically, in a graph with \u0000<inline-formula> <tex-math>$N$ </tex-math></inline-formula>\u0000 nodes and maximum degree \u0000<inline-formula> <tex-math>$d$ </tex-math></inline-formula>\u0000, under a general measure of model deviation \u0000<inline-formula> <tex-math>$C$ </tex-math></inline-formula>\u0000, the cumulative regret is upper bounded by \u0000<inline-formula> <tex-math>$tilde {mathcal {O}}left({d^{L-{}frac {1}{2}}(sqrt {NT} + NC)}right)$ </tex-math></inline-formula>\u0000 and lower bounded by \u0000<inline-formula> <tex-math>$Omega left({d^{frac {L}{2}-2}max {sqrt {T};, ; d^{2}C}}right)$ </tex-math></inline-formula>\u0000. 
Comparing these bounds establishes that the proposed algorithm achieves nearly optimal \u0000<inline-formula> <tex-math>$tilde{mathcal {O}} (sqrt {T})$ </tex-math></inline-formula>\u0000 regret when \u0000<inline-formula> <tex-math>$C$ </tex-math></inline-formula>\u0000 is \u0000<inline-formula> <tex-math>$o(sqrt {T})$ </tex-math></inline-formula>\u0000 and maintains sub-linear regret for a broader range of \u0000<inline-formula> <tex-math>$C$ </tex-math></inline-formula>\u0000.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"78-93"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140605970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
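To make the setting concrete, here is a minimal linear-SEM simulator (illustrative only; the do-style intervention model, the chain graph, and all parameters are our assumptions): each node is a linear function of its parents plus noise, an intervention overwrites a node's value, and the reward is the value of the sink node.

```python
import numpy as np

def sample_sem(B, noise_std, intervention=None, rng=None):
    """Sample a linear SEM X_j = sum_i B[i, j] * X_i + N_j, with nodes indexed
    in topological order; `intervention` maps node -> forced value (do-operator)."""
    rng = rng or np.random.default_rng()
    x = np.zeros(B.shape[0])
    for j in range(B.shape[0]):
        x[j] = x @ B[:, j] + noise_std * rng.normal()
        if intervention and j in intervention:
            x[j] = intervention[j]        # overwrite: hard intervention
    return x

# chain 0 -> 1 -> 2 (longest causal path L = 3); reward = sink node value
B = np.array([[0.0, 1.5, 0.0],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
rng = np.random.default_rng(0)
rewards = [sample_sem(B, 0.1, {0: 1.0}, rng)[-1] for _ in range(1000)]
print(np.mean(rewards))   # approx 1.0 * 1.5 * 0.8 = 1.2 under do(X0 = 1)
```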