{"title":"Exactly Optimal and Communication-Efficient Private Estimation via Block Designs","authors":"Hyun-Young Park;Seung-Hyun Nam;Si-Hyeon Lee","doi":"10.1109/JSAIT.2024.3381195","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3381195","url":null,"abstract":"In this paper, we propose a new class of local differential privacy (LDP) schemes based on combinatorial block designs for discrete distribution estimation. This class not only recovers many known LDP schemes in a unified framework of combinatorial block design, but also suggests a novel way of finding new schemes achieving the exactly optimal (or near-optimal) privacy-utility trade-off with lower communication costs. Indeed, we find many new LDP schemes that achieve the exactly optimal privacy-utility trade-off, with the minimum communication cost among all the unbiased or consistent schemes, for a certain set of input data size and LDP constraint. Furthermore, to partially solve the sparse existence issue of block design schemes, we consider a broader class of LDP schemes based on regular and pairwise-balanced designs, called RPBD schemes, which relax one of the symmetry requirements on block designs. By considering this broader class of RPBD schemes, we can find LDP schemes achieving near-optimal privacy-utility trade-off with reasonably low communication costs for a much larger set of input data size and LDP constraint.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"123-134"},"PeriodicalIF":0.0,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140621268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Robust to Distributional Uncertainties and Adversarial Data","authors":"Alireza Sadeghi;Gang Wang;Georgios B. Giannakis","doi":"10.1109/JSAIT.2024.3381869","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3381869","url":null,"abstract":"Successful training of data-intensive deep neural networks critically rely on vast, clean, and high-quality datasets. In practice however, their reliability diminishes, particularly with noisy, outlier-corrupted data samples encountered in testing. This challenge intensifies when dealing with anonymized, heterogeneous data sets stored across geographically distinct locations due to, e.g., privacy concerns. This present paper introduces robust learning frameworks tailored for centralized and federated learning scenarios. Our goal is to fortify model resilience with a focus that lies in (i) addressing distribution shifts from training to inference time; and, (ii) ensuring test-time robustness, when a trained model may encounter outliers or adversarially contaminated test data samples. To this aim, we start with a centralized setting where the true data distribution is considered unknown, but residing within a Wasserstein ball centered at the empirical distribution. We obtain robust models by minimizing the worst-case expected loss within this ball, yielding an intractable infinite-dimensional optimization problem. Upon leverage the strong duality condition, we arrive at a tractable surrogate learning problem. We develop two stochastic primal-dual algorithms to solve the resultant problem: one for \u0000<inline-formula> <tex-math>$epsilon $ </tex-math></inline-formula>\u0000-accurate convex sub-problems and another for a single gradient ascent step. We further develop a distributionally robust federated learning framework to learn robust model using heterogeneous data sets stored at distinct locations by solving per-learner’s sub-problems locally, offering robustness with modest computational overhead and considering data distribution. Numerical tests corroborate merits of our training algorithms against distributional uncertainties and adversarially corrupted test data samples.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"105-122"},"PeriodicalIF":0.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140619580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exactly Tight Information-Theoretic Generalization Error Bound for the Quadratic Gaussian Problem","authors":"Ruida Zhou;Chao Tian;Tie Liu","doi":"10.1109/JSAIT.2024.3380598","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3380598","url":null,"abstract":"We provide a new information-theoretic generalization error bound that is exactly tight (i.e., matching even the constant) for the canonical quadratic Gaussian (location) problem. Most existing bounds are order-wise loose in this setting, which has raised concerns about the fundamental capability of information-theoretic bounds in reasoning the generalization behavior for machine learning. The proposed new bound adopts the individual-sample-based approach proposed by Bu et al., but also has several key new ingredients. Firstly, instead of applying the change of measure inequality on the loss function, we apply it to the generalization error function itself; secondly, the bound is derived in a conditional manner; lastly, a reference distribution is introduced. The combination of these components produces a KL-divergence-based generalization error bound. We show that although the latter two new ingredients can help make the bound exactly tight, removing them does not significantly degrade the bound, leading to an asymptotically tight mutual-information-based bound. We further consider the vector Gaussian setting, where a direct application of the proposed bound again does not lead to tight bounds except in special cases. A refined bound is then proposed by a decomposition of loss functions, leading to a tight bound for the vector setting.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"94-104"},"PeriodicalIF":0.0,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140606029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Summary Statistic Privacy in Data Sharing","authors":"Zinan Lin;Shuaiqi Wang;Vyas Sekar;Giulia Fanti","doi":"10.1109/JSAIT.2024.3403811","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3403811","url":null,"abstract":"We study a setting where a data holder wishes to share data with a receiver, without revealing certain summary statistics of the data distribution (e.g., mean, standard deviation). It achieves this by passing the data through a randomization mechanism. We propose summary statistic privacy, a metric for quantifying the privacy risk of such a mechanism based on the worst-case probability of an adversary guessing the distributional secret within some threshold. Defining distortion as a worst-case Wasserstein-1 distance between the real and released data, we prove lower bounds on the tradeoff between privacy and distortion. We then propose a class of quantization mechanisms that can be adapted to different data distributions. We show that the quantization mechanism’s privacy-distortion tradeoff matches our lower bounds under certain regimes, up to small constant factors. Finally, we demonstrate on real-world datasets that the proposed quantization mechanisms achieve better privacy-distortion tradeoffs than alternative privacy mechanisms.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"369-384"},"PeriodicalIF":0.0,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141422569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Straggler-Resilient Differentially Private Decentralized Learning","authors":"Yauhen Yakimenka;Chung-Wei Weng;Hsuan-Yin Lin;Eirik Rosnes;Jörg Kliewer","doi":"10.1109/JSAIT.2024.3400995","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3400995","url":null,"abstract":"We consider the straggler problem in decentralized learning over a logical ring while preserving user data privacy. Especially, we extend the recently proposed framework of differential privacy (DP) amplification by decentralization by Cyffers and Bellet to include overall training latency—comprising both computation and communication latency. Analytical results on both the convergence speed and the DP level are derived for both a skipping scheme (which ignores the stragglers after a timeout) and a baseline scheme that waits for each node to finish before the training continues. A trade-off between overall training latency, accuracy, and privacy, parameterized by the timeout of the skipping scheme, is identified and empirically validated for logistic regression on a real-world dataset and for image classification using the MNIST and CIFAR-10 datasets.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"407-423"},"PeriodicalIF":0.0,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141495176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Algorithmic Recourse Under Model Multiplicity With Probabilistic Guarantees","authors":"Faisal Hamman;Erfaun Noorani;Saumitra Mishra;Daniele Magazzeni;Sanghamitra Dutta","doi":"10.1109/JSAIT.2024.3401407","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3401407","url":null,"abstract":"There is an emerging interest in generating robust algorithmic recourse that would remain valid if the model is updated or changed even slightly. Towards finding robust algorithmic recourse (or counterfactual explanations), existing literature often assumes that the original model \u0000<italic>m</i>\u0000 and the new model \u0000<italic>M</i>\u0000 are bounded in the parameter space, i.e., \u0000<inline-formula> <tex-math>$|text {Params}(M){-}text {Params}(m)|{lt }Delta $ </tex-math></inline-formula>\u0000. However, models can often change significantly in the parameter space with little to no change in their predictions or accuracy on the given dataset. In this work, we introduce a mathematical abstraction termed \u0000<italic>naturally-occurring</i>\u0000 model change, which allows for arbitrary changes in the parameter space such that the change in predictions on points that lie on the data manifold is limited. Next, we propose a measure – that we call \u0000<italic>Stability</i>\u0000 – to quantify the robustness of counterfactuals to potential model changes for differentiable models, e.g., neural networks. Our main contribution is to show that counterfactuals with sufficiently high value of \u0000<italic>Stability</i>\u0000 as defined by our measure will remain valid after potential “naturally-occurring” model changes with high probability (leveraging concentration bounds for Lipschitz function of independent Gaussians). Since our quantification depends on the local Lipschitz constant around a data point which is not always available, we also examine estimators of our proposed measure and derive a fundamental lower bound on the sample size required to have a precise estimate. We explore methods of using stability measures to generate robust counterfactuals that are close, realistic, and remain valid after potential model changes. This work also has interesting connections with model multiplicity, also known as the Rashomon effect.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"357-368"},"PeriodicalIF":0.0,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141435351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contraction of Locally Differentially Private Mechanisms","authors":"Shahab Asoodeh;Huanyu Zhang","doi":"10.1109/JSAIT.2024.3397305","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397305","url":null,"abstract":"We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between \u0000<inline-formula> <tex-math>$P{mathsf K}$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$Q{mathsf K}$ </tex-math></inline-formula>\u0000 output distributions of an \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000-LDP mechanism \u0000<inline-formula> <tex-math>$mathsf K$ </tex-math></inline-formula>\u0000 in terms of a divergence between the corresponding input distributions P and Q, respectively. Our first main technical result presents a sharp upper bound on the \u0000<inline-formula> <tex-math>$chi ^{2}$ </tex-math></inline-formula>\u0000-divergence \u0000<inline-formula> <tex-math>$chi ^{2}(P{mathsf K}|Q{mathsf K})$ </tex-math></inline-formula>\u0000 in terms of \u0000<inline-formula> <tex-math>$chi ^{2}(P|Q)$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on \u0000<inline-formula> <tex-math>$chi ^{2}(P{mathsf K}|Q{mathsf K})$ </tex-math></inline-formula>\u0000 in terms of total variation distance \u0000<inline-formula> <tex-math>${textsf {TV}}(P, Q)$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000. We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam’s, Assouad’s, and the mutual information methods —powerful tools for bounding minimax estimation risks. These results are shown to lead to tighter privacy analyses than the state-of-the-arts in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"385-395"},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141494817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing","authors":"Lucas Monteiro Paes;Ananda Theertha Suresh;Alex Beutel;Flavio P. Calmon;Ahmad Beirami","doi":"10.1109/JSAIT.2024.3397741","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397741","url":null,"abstract":"Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the performance of a fixed ML model across population groups defined by multiple sensitive attributes (e.g., race and sex and age). Here, the sample complexity for estimating the worst-case performance gap across groups (e.g., the largest difference in error rates) increases exponentially with the number of group-denoting sensitive attributes. To address this issue, we propose an approach to test for performance disparities based on Conditional Value-at-Risk (CVaR). By allowing a small probabilistic slack on the groups over which a model has approximately equal performance, we show that the sample complexity required for discovering performance violations is reduced exponentially to be at most upper bounded by the square root of the number of groups. As a byproduct of our analysis, when the groups are weighted by a specific prior distribution, we show that Rényi entropy of order 2/3 of the prior distribution captures the sample complexity of the proposed CVaR test algorithm. Finally, we also show that there exists a non-i.i.d. data collection strategy that results in a sample complexity independent of the number of groups.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"659-674"},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142736343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and Robust Classification for Sparse Attacks","authors":"Mark Beliaev;Payam Delgosha;Hamed Hassani;Ramtin Pedarsani","doi":"10.1109/JSAIT.2024.3397187","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397187","url":null,"abstract":"Over the past two decades, the rise in adoption of neural networks has surged in parallel with their performance. Concurrently, we have observed the inherent fragility of these prediction models: small changes to the inputs can induce classification errors across entire datasets. In the following study, we examine perturbations constrained by the \u0000<inline-formula> <tex-math>$ell _{0}$ </tex-math></inline-formula>\u0000–norm, a potent attack model in the domains of computer vision, malware detection, and natural language processing. To combat this adversary, we introduce a novel defense technique comprised of two components: “truncation” and “adversarial training”. Subsequently, we conduct a theoretical analysis of the Gaussian mixture setting and establish the asymptotic optimality of our proposed defense. Based on this obtained insight, we broaden the application of our technique to neural networks. Lastly, we empirically validate our results in the domain of computer vision, demonstrating substantial enhancements in the robust classification error of neural networks.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"261-272"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141096203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Causal Bandits for Linear Models","authors":"Zirui Yan;Arpan Mukherjee;Burak Varıcı;Ali Tajer","doi":"10.1109/JSAIT.2024.3373595","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3373595","url":null,"abstract":"The sequential design of experiments for optimizing a reward function in causal systems can be effectively modeled by the sequential design of interventions in causal bandits (CBs). In the existing literature on CBs, a critical assumption is that the causal models remain constant over time. However, this assumption does not necessarily hold in complex systems, which constantly undergo temporal model fluctuations. This paper addresses the robustness of CBs to such model fluctuations. The focus is on causal systems with linear structural equation models (SEMs). The SEMs and the time-varying pre- and post-interventional statistical models are all unknown. Cumulative regret is adopted as the design criteria, based on which the objective is to design a sequence of interventions that incur the smallest cumulative regret with respect to an oracle aware of the entire causal model and its fluctuations. First, it is established that the existing approaches fail to maintain regret sub-linearity with even a few instances of model deviation. Specifically, when the number of instances with model deviation is as few as \u0000<inline-formula> <tex-math>$T^{frac {1}{2L}}$ </tex-math></inline-formula>\u0000, where \u0000<inline-formula> <tex-math>$T$ </tex-math></inline-formula>\u0000 is the time horizon and \u0000<inline-formula> <tex-math>$L$ </tex-math></inline-formula>\u0000 is the length of the longest causal path in the graph, the existing algorithms will have linear regret in \u0000<inline-formula> <tex-math>$T$ </tex-math></inline-formula>\u0000. For instance, when \u0000<inline-formula> <tex-math>$T=10^{5}$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$L=3$ </tex-math></inline-formula>\u0000, model deviations in 6 out of 105 instances result in a linear regret. Next, a robust CB algorithm is designed, and its regret is analyzed, where upper and information-theoretic lower bounds on the regret are established. Specifically, in a graph with \u0000<inline-formula> <tex-math>$N$ </tex-math></inline-formula>\u0000 nodes and maximum degree \u0000<inline-formula> <tex-math>$d$ </tex-math></inline-formula>\u0000, under a general measure of model deviation \u0000<inline-formula> <tex-math>$C$ </tex-math></inline-formula>\u0000, the cumulative regret is upper bounded by \u0000<inline-formula> <tex-math>$tilde {mathcal {O}}left({d^{L-{}frac {1}{2}}(sqrt {NT} + NC)}right)$ </tex-math></inline-formula>\u0000 and lower bounded by \u0000<inline-formula> <tex-math>$Omega left({d^{frac {L}{2}-2}max {sqrt {T};, ; d^{2}C}}right)$ </tex-math></inline-formula>\u0000. 
Comparing these bounds establishes that the proposed algorithm achieves nearly optimal \u0000<inline-formula> <tex-math>$tilde{mathcal {O}} (sqrt {T})$ </tex-math></inline-formula>\u0000 regret when \u0000<inline-formula> <tex-math>$C$ </tex-math></inline-formula>\u0000 is \u0000<inline-formula> <tex-math>$o(sqrt {T})$ </tex-math></inline-formula>\u0000 and maintains sub-linear regret for a broader range of \u0000<inline-formula> <tex-math>$C$ </tex-math></inline-formula>\u0000.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"78-93"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140605970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
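To make the setting concrete, here is a minimal linear-SEM simulator (illustrative only; the do-style intervention model, the chain graph, and all parameters are our assumptions): each node is a linear function of its parents plus noise, an intervention overwrites a node's value, and the reward is the value of the sink node.

```python
import numpy as np

def sample_sem(B, noise_std, intervention=None, rng=None):
    """Sample a linear SEM X_j = sum_i B[i, j] * X_i + N_j, with nodes indexed
    in topological order; `intervention` maps node -> forced value (do-operator)."""
    rng = rng or np.random.default_rng()
    x = np.zeros(B.shape[0])
    for j in range(B.shape[0]):
        x[j] = x @ B[:, j] + noise_std * rng.normal()
        if intervention and j in intervention:
            x[j] = intervention[j]        # overwrite: hard intervention
    return x

# chain 0 -> 1 -> 2 (longest causal path L = 3); reward = sink node value
B = np.array([[0.0, 1.5, 0.0],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
rng = np.random.default_rng(0)
rewards = [sample_sem(B, 0.1, {0: 1.0}, rng)[-1] for _ in range(1000)]
print(np.mean(rewards))   # approx 1.0 * 1.5 * 0.8 = 1.2 under do(X0 = 1)
```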