{"title":"Testing Dependency of Unlabeled Databases","authors":"Vered Paslev;Wasim Huleihel","doi":"10.1109/TIT.2024.3442977","DOIUrl":"10.1109/TIT.2024.3442977","url":null,"abstract":"In this paper, we investigate the problem of deciding whether two random databases $\\textsf{X}\\in \\mathcal{X}^{n\\times d}$ and $\\textsf{Y}\\in \\mathcal{Y}^{n\\times d}$ are statistically dependent or not. This is formulated as a hypothesis testing problem, where under the null hypothesis, these two databases are statistically independent, while under the alternative, there exists an unknown row permutation $\\sigma$, such that $\\textsf{X}$ and $\\textsf{Y}^{\\sigma}$, a permuted version of $\\textsf{Y}$, are statistically dependent with some known joint distribution, but have the same marginal distributions as the null. We characterize the thresholds at which optimal testing is information-theoretically impossible and possible, as a function of n, d, and some spectral properties of the generative distributions of the datasets. For example, we prove that if a certain function of the eigenvalues of the likelihood function and d, is below a certain threshold, as $d\\to \\infty$, then weak detection (performing slightly better than random guessing) is statistically impossible, no matter what the value of n is. This mimics the performance of an efficient test that thresholds a centered version of the log-likelihood function of the observed matrices. 
We also analyze the case where d is fixed, for which we derive strong (vanishing error) and weak detection lower and upper bounds.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 10","pages":"7410-7431"},"PeriodicalIF":2.2,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142194934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transfer Learning in Bandits With Latent Continuity","authors":"Hyejin Park;Seiyun Shin;Kwang-Sung Jun;Jungseul Ok","doi":"10.1109/TIT.2024.3441669","DOIUrl":"10.1109/TIT.2024.3441669","url":null,"abstract":"A continuity structure of correlations among arms in multi-armed bandit can bring a significant acceleration of exploration and reduction of regret, in particular, when there are many arms. However, it is often latent in practice. To cope with the latent continuity, we consider a transfer learning setting where an agent learns the structural information, parameterized by a Lipschitz constant and an embedding of arms, from a sequence of past tasks and transfers it to a new one. We propose a simple but provably-efficient algorithm to accurately estimate and fully exploit the Lipschitz continuity at the same asymptotic order of lower bound of sample complexity in the previous tasks. The proposed algorithm is applicable to estimate not only a latent Lipschitz constant given an embedding, but also a latent embedding, while the latter requires slightly more sample complexity. To be specific, we analyze the efficiency of the proposed framework in two folds: (i) our regret bound on the new task is close to that of the oracle algorithm with the full knowledge of the Lipschitz continuity under mild assumptions; and (ii) the sample complexity of our estimator matches with the information-theoretic fundamental limit. Our analysis reveals a set of useful insights on transfer learning for latent Lipschitz continuity. 
From a numerical evaluation based on real-world dataset of rate adaptation in time-varying wireless channel, we demonstrate the theoretical findings and show the superiority of the proposed framework compared to baselines.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 11","pages":"7952-7970"},"PeriodicalIF":2.2,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142194904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Multi-Round Multi-Party Privacy-Preserving Neural Network Training","authors":"Xingyu Lu;Umit Yigit Basaran;Başak Güler","doi":"10.1109/TIT.2024.3441509","DOIUrl":"10.1109/TIT.2024.3441509","url":null,"abstract":"Privacy-preserving machine learning has achieved breakthrough advances in collaborative training of machine learning models, under strong information-theoretic privacy guarantees. Despite the recent advances, communication bottleneck still remains as a major challenge against scalability in neural networks. To address this challenge, this paper presents the first scalable multi-party neural network training framework with linear communication complexity, significantly improving over the quadratic state-of-the-art, under strong end-to-end information-theoretic privacy guarantees. Our contribution is an iterative coded computing mechanism with linear communication complexity, termed Double Lagrange Coding, which allows iterative scalable multi-party polynomial computations without degrading the parallelization gain, adversary tolerance, and dropout resilience throughout the iterations. While providing strong multi-round information-theoretic privacy guarantees, our framework achieves equal adversary tolerance, resilience to user dropouts, and model accuracy to the state-of-the-art, while reducing the communication overhead from quadratic to linear. 
In doing so, our framework addresses a key technical challenge in collaborative privacy-preserving machine learning, while paving the way for large-scale privacy-preserving iterative algorithms for deep learning and beyond.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 11","pages":"8204-8236"},"PeriodicalIF":2.2,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conflict-Avoiding Codes of Prime Lengths and Cyclotomic Numbers","authors":"Liang-Chung Hsia;Hua-Chieh Li;Wei-Liang Sun","doi":"10.1109/TIT.2024.3439714","DOIUrl":"10.1109/TIT.2024.3439714","url":null,"abstract":"The problem to construct optimal conflict-avoiding codes of even lengths and the Hamming weight 3 is completely settled. On the contrary, it is still open for odd lengths. It turns out that the prime lengths are the fundamental cases needed to be constructed. In the article, we study conflict-avoiding codes of prime lengths and give a connection with the so-called cyclotomic numbers. By having some nonzero cyclotomic numbers, a well-known algorithm for constructing optimal conflict-avoiding codes will work for certain prime lengths. As a consequence, we are able to answer the size of optimal conflict-avoiding code for a new class of prime lengths.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 10","pages":"6834-6841"},"PeriodicalIF":2.2,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrections to “Private Information Retrieval Over Gaussian MAC”","authors":"Or Elimelech;Ori Shmuel;Asaf Cohen","doi":"10.1109/TIT.2024.3440476","DOIUrl":"10.1109/TIT.2024.3440476","url":null,"abstract":"In the above article [1], the authors introduced a PIR scheme for the Additive White Gaussian Noise (AWGN) Multiple Access Channel (MAC), both with and without fading. The authors utilized the additive nature of the channel and leveraged the linear properties and structure of lattice codes to retrieve the desired message without the servers acquiring any knowledge about the retrieved message’s index. Theorems 3 and 4 in [1] contain an error arising from the incorrect usage of the modulo operator. Moreover, the proofs assume a one-to-one mapping function, $\\phi(\\cdot)$, between a message $W_{j}\\in \\mathbb{F}_{p}^{L}$ and the elements of $\\mathcal{C}$, mistakenly suggesting that the user possesses all the required information in advance. To deal with that, we defined $\\phi(\\cdot)$ as a one-to-one mapping function between a vector of $l$ information bits and a lattice point $\\lambda \\in \\mathcal{C}$. 
Herein, we present the corrected versions of these theorems.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 10","pages":"7521-7524"},"PeriodicalIF":2.2,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Missing g-Mass: Investigating the Missing Parts of Distributions","authors":"Prafulla Chandra;Andrew Thangaraj","doi":"10.1109/TIT.2024.3440661","DOIUrl":"10.1109/TIT.2024.3440661","url":null,"abstract":"Estimating the underlying distribution from iid samples is a classical and important problem in statistics. When the alphabet size is large compared to number of samples, a portion of the distribution is highly likely to be unobserved or sparsely observed. The missing mass, defined as the sum of probabilities $\\Pr(x)$ over the missing letters x, and the Good-Turing estimator for missing mass have been important tools in large-alphabet distribution estimation. In this article, given a positive function g from $[0,1]$ to the reals, the missing g-mass, defined as the sum of $g(\\Pr(x))$ over the missing letters x, is introduced and studied. The missing g-mass can be used to investigate the structure of the missing part of the distribution. Specific applications for special cases such as order-$\\alpha$ missing mass ($g(p)=p^{\\alpha}$) and the missing Shannon entropy ($g(p)=-p\\log p$) include estimating distance from uniformity of the missing distribution and its partial estimation. Minimax estimation is studied for order-$\\alpha$ missing mass for integer values of $\\alpha$ and exact minimax convergence rates are obtained. 
Concentration is studied for a class of functions g and specific results are derived for order-$\\alpha$ missing mass and missing Shannon entropy. Sub-Gaussian tail bounds with near-optimal worst-case variance factors are derived. Two new notions of concentration, named strongly sub-Gamma and filtered sub-Gaussian concentration, are introduced and shown to result in right tail bounds that are better than those obtained from sub-Gaussian concentration.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 10","pages":"7049-7065"},"PeriodicalIF":2.2,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Worst-Case Misidentification Control in Sequential Change Diagnosis Using the Min-CuSum","authors":"Austin Warner;Georgios Fellouris","doi":"10.1109/TIT.2024.3437158","DOIUrl":"10.1109/TIT.2024.3437158","url":null,"abstract":"The problem of sequential change diagnosis is considered, where a sequence of independent random elements is accessed sequentially, there is an abrupt change in its distribution at some unknown time, and there are two main operational goals: to quickly detect the change, and to accurately identify upon stopping the post-change distribution among a finite set of alternatives. The focus is on the min-CuSum algorithm, which raises an alarm as soon as a CuSum statistic that corresponds to one of the post-change alternatives exceeds a certain threshold. We obtain, under certain assumptions, non-asymptotic upper bounds on its conditional probability of misidentification given that a false alarm did not occur. When, in particular, the data are generated over independent channels and the change can occur in only one of them, its worst-case—with respect to the change point—conditional probability of misidentification given that there was not a false alarm is shown to decay exponentially fast in the threshold. As a corollary, in this setup, the min-CuSum is shown to asymptotically minimize Lorden’s detection delay criterion, simultaneously for every post-change scenario, within the class of schemes that satisfy prescribed bounds on both the false alarm rate and the worst-case conditional probability of misidentification, in a regime where the latter does not go to zero faster than the former. 
Finally, these theoretical results are also illustrated in simulation studies.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 11","pages":"8364-8377"},"PeriodicalIF":2.2,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10632080","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141969540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple-Error-Correcting Codes for Analog Computing on Resistive Crossbars","authors":"Hengjia Wei;Ron M. Roth","doi":"10.1109/tit.2024.3439674","DOIUrl":"https://doi.org/10.1109/tit.2024.3439674","url":null,"abstract":"","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"6 1","pages":""},"PeriodicalIF":2.5,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Further Study of Vectorial Dual-Bent Functions","authors":"Jiaxin Wang;Fang-Wei Fu;Yadi Wei;Jing Yang","doi":"10.1109/TIT.2024.3439375","DOIUrl":"10.1109/TIT.2024.3439375","url":null,"abstract":"Vectorial dual-bent functions have recently attracted some researchers’ interest as they play a significant role in constructing partial difference sets, association schemes, bent partitions, and linear codes. In this paper, we further study vectorial dual-bent functions $F: V_{n}^{(p)}\\rightarrow V_{m}^{(p)}$, where $2\\leq m \\leq \\frac{n}{2}$, and $V_{n}^{(p)}$ denotes an n-dimensional vector space over the prime field $\\mathbb{F}_{p}$. For certain vectorial dual-bent functions (called vectorial dual-bent functions with Condition A), we present a more concise characterization in terms of partial difference sets than the one given in Wang et al. (2023), and give new characterizations in terms of amorphic association schemes, linear codes, and generalized Hadamard matrices, respectively. When $p=2$, we characterize vectorial dual-bent functions with Condition A in terms of bent partitions. Through the relationship between vectorial dual-bent functions and bent partitions, new characterizations of certain bent partitions in terms of amorphic association schemes, linear codes, and generalized Hadamard matrices are obtained. 
For a vectorial dual-bent function $F: V_{n}^{(p)}\\rightarrow V_{m}^{(p)}$ with $F(0)=0, F(x)=F(-x)$, where $2\\leq m \\leq \\frac{n}{2}$, we give a necessary and sufficient condition under which the preimage set partition of F induces an association scheme. By using two classes of vectorial dual-bent functions, more association schemes are obtained.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"70 10","pages":"7472-7483"},"PeriodicalIF":2.2,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}