{"title":"Dynamic ranking and translation synchronization","authors":"E. Araya, Eglantine Karlé, Hemant Tyagi","doi":"10.1093/imaiai/iaad029","DOIUrl":"https://doi.org/10.1093/imaiai/iaad029","url":null,"abstract":"In many applications, such as sport tournaments or recommendation systems, we have at our disposal data consisting of pairwise comparisons between a set of $n$ items (or players). The objective is to use these data to infer the latent strength of each item and/or their ranking. Existing results for this problem predominantly focus on the setting consisting of a single comparison graph $G$. However, there exist scenarios (e.g. sports tournaments) where the pairwise comparison data evolve with time. Theoretical results for this dynamic setting are relatively limited, and are the focus of this paper. We study an extension of the translation synchronization problem, to the dynamic setting. In this set-up, we are given a sequence of comparison graphs $(G_t)_{t\\in\\mathscr{T}}$, where $\\mathscr{T} \\subset [0,1]$ is a grid representing the time domain, and for each item $i$ and time $t\\in\\mathscr{T}$ there is an associated unknown strength parameter $z^*_{t,i}\\in\\mathbb{R}$. We aim to recover, for $t\\in\\mathscr{T}$, the strength vector $z^*_t=(z^*_{t,1},\\dots ,z^*_{t,n})$ from noisy measurements of $z^*_{t,i}-z^*_{t,j}$, where $\\{i,j\\}$ is an edge in $G_t$. Assuming that $z^*_t$ evolves smoothly in $t$, we propose two estimators: one based on a smoothness-penalized least squares approach and the other based on projection onto the low-frequency eigenspace of a suitable smoothness operator. For both estimators, we provide finite sample bounds for the $\\ell_2$ estimation error under the assumption that $G_t$ is connected for all $t\\in\\mathscr{T}$, thus proving the consistency of the proposed methods in terms of the grid size $|\\mathscr{T}|$. We complement our theoretical findings with experiments on synthetic and real data.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82565710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Super-resolution multi-reference alignment.","authors":"Tamir Bendory, Ariel Jaffe, William Leeb, Nir Sharon, Amit Singer","doi":"10.1093/imaiai/iaab003","DOIUrl":"10.1093/imaiai/iaab003","url":null,"abstract":"We study super-resolution multi-reference alignment, the problem of estimating a signal from many circularly shifted, down-sampled and noisy observations. We focus on the low SNR regime, and show that a signal in $\\mathbb{R}^M$ is uniquely determined when the number $L$ of samples per observation is of the order of the square root of the signal's length ($L = O(\\sqrt{M})$). Phrased more informally, one can square the resolution. This result holds if the number of observations is proportional to $1/\\mathrm{SNR}^3$. In contrast, with fewer observations recovery is impossible even when the observations are not down-sampled ($L = M$). The analysis combines tools from statistical signal processing and invariant theory. We design an expectation-maximization algorithm and demonstrate that it can super-resolve the signal in challenging SNR regimes.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9374099/pdf/nihms-1776575.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40708781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimax optimal clustering of bipartite graphs with a generalized power method","authors":"Guillaume Braun, Hemant Tyagi","doi":"10.1093/imaiai/iaad006","DOIUrl":"https://doi.org/10.1093/imaiai/iaad006","url":null,"abstract":"Clustering bipartite graphs is a fundamental task in network analysis. In the high-dimensional regime where the number of rows $n_{1}$ and the number of columns $n_{2}$ of the associated adjacency matrix are of different order, the existing methods derived from the ones used for symmetric graphs can come with sub-optimal guarantees. Due to increasing number of applications for bipartite graphs in the high-dimensional regime, it is of fundamental importance to design optimal algorithms for this setting. The recent work of Ndaoud et al. (2022, IEEE Trans. Inf. Theory, 68, 1960–1975) improves the existing upper-bound for the misclustering rate in the special case where the columns (resp. rows) can be partitioned into $L = 2$ (resp. $K = 2$) communities. Unfortunately, their algorithm cannot be extended to the more general setting where $K \\neq L \\geq 2$. We overcome this limitation by introducing a new algorithm based on the power method. We derive conditions for exact recovery in the general setting where $K \\neq L \\geq 2$, and show that it recovers the result in Ndaoud et al. (2022, IEEE Trans. Inf. Theory, 68, 1960–1975). We also derive a minimax lower bound on the misclustering error when $K=L$ under a symmetric version of our model, which matches the corresponding upper bound up to a factor depending on $K$.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74089679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An analysis of classical multidimensional scaling with applications to clustering.","authors":"Anna Little, Yuying Xie, Qiang Sun","doi":"10.1093/imaiai/iaac004","DOIUrl":"10.1093/imaiai/iaac004","url":null,"abstract":"Classical multidimensional scaling is a widely used dimension reduction technique. Yet few theoretical results characterizing its statistical performance exist. This paper provides a theoretical framework for analyzing the quality of embedded samples produced by classical multidimensional scaling. This lays a foundation for various downstream statistical analyses, and we focus on clustering noisy data. Our results provide scaling conditions on the signal-to-noise ratio under which classical multidimensional scaling followed by a distance-based clustering algorithm can recover the cluster labels of all samples. Simulation studies confirm these scaling conditions are sharp. Applications to the cancer gene-expression data, the single-cell RNA sequencing data and the natural language data lend strong support to the methodology and theory.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2022-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9893760/pdf/iaac004.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9392159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear convergence of the subspace constrained mean shift algorithm: from Euclidean to directional data.","authors":"Yikun Zhang, Yen-Chi Chen","doi":"10.1093/imaiai/iaac005","DOIUrl":"10.1093/imaiai/iaac005","url":null,"abstract":"This paper studies the linear convergence of the subspace constrained mean shift (SCMS) algorithm, a well-known algorithm for identifying a density ridge defined by a kernel density estimator. By arguing that the SCMS algorithm is a special variant of a subspace constrained gradient ascent (SCGA) algorithm with an adaptive step size, we derive the linear convergence of such SCGA algorithm. While the existing research focuses mainly on density ridges in the Euclidean space, we generalize density ridges and the SCMS algorithm to directional data. In particular, we establish the stability theorem of density ridges with directional data and prove the linear convergence of our proposed directional SCMS algorithm.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2022-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9893762/pdf/iaac005.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9316422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zero-truncated Poisson regression for sparse multiway count data corrupted by false zeros","authors":"Oscar López, Daniel M. Dunlavy, R. Lehoucq","doi":"10.1093/imaiai/iaad016","DOIUrl":"https://doi.org/10.1093/imaiai/iaad016","url":null,"abstract":"We propose a novel statistical inference methodology for multiway count data that is corrupted by false zeros that are indistinguishable from true zero counts. Our approach consists of zero-truncating the Poisson distribution to neglect all zero values. This simple truncated approach dispenses with the need to distinguish between true and false zero counts and reduces the amount of data to be processed. Inference is accomplished via tensor completion that imposes low-rank tensor structure on the Poisson parameter space. Our main result shows that an $N$-way rank-$R$ parametric tensor $\\boldsymbol{\\mathscr{M}}\\in (0,\\infty )^{I\\times \\cdots \\times I}$ generating Poisson observations can be accurately estimated by zero-truncated Poisson regression from approximately $IR^2\\log _2^2(I)$ non-zero counts under the nonnegative canonical polyadic decomposition. Our result also quantifies the error made by zero-truncating the Poisson distribution when the parameter is uniformly bounded from below. Therefore, under a low-rank multiparameter model, we propose an implementable approach guaranteed to achieve accurate regression in under-determined scenarios with substantial corruption by false zeros. Several numerical experiments are presented to explore the theoretical results.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89460693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OUP accepted manuscript","authors":"","doi":"10.1093/imaiai/iaac012","DOIUrl":"https://doi.org/10.1093/imaiai/iaac012","url":null,"abstract":"","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78841945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OUP accepted manuscript","authors":"","doi":"10.1093/imaiai/iaac007","DOIUrl":"https://doi.org/10.1093/imaiai/iaac007","url":null,"abstract":"","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87576118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OUP accepted manuscript","authors":"","doi":"10.1093/imaiai/iaac008","DOIUrl":"https://doi.org/10.1093/imaiai/iaac008","url":null,"abstract":"","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83796223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OUP accepted manuscript","authors":"","doi":"10.1093/imaiai/iaac011","DOIUrl":"https://doi.org/10.1093/imaiai/iaac011","url":null,"abstract":"","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80590832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}