Breaking the Limits of Subspace Inference
Claudia R. Solís-Lemus, Daniel L. Pimentel-Alarcón
2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), October 2018
DOI: 10.1109/ALLERTON.2018.8635999
Abstract
Inferring low-dimensional subspaces that describe high-dimensional, highly incomplete datasets has become a routine procedure in modern data science. This paper is about a curious phenomenon related to the amount of information required to estimate a subspace. On one hand, it has been shown that, information-theoretically, data in $\mathbb{R}^d$ must be observed on at least $\ell = r+1$ coordinates to uniquely identify an $r$-dimensional subspace that approximates it. On the other hand, it is well-known that the subspace containing a dataset can be estimated through its sample covariance matrix, which only requires observing $\ell = 2$ coordinates per datapoint (regardless of $r$!). At first glance, this may seem to contradict the information-theoretic bound. The key lies in the subtle difference between identifiability (uniqueness) and estimation (most probable). It is true that if we only observe $\ell \leq r$ coordinates per datapoint, there will be infinitely many $r$-dimensional subspaces that perfectly agree with the observations. However, some subspaces may be more likely than others, and the sample covariance reveals which. This raises several fundamental questions: what are the algebraic relationships hidden in $2$ coordinates that allow estimating an $r$-dimensional subspace? Moreover, are $\ell = 2$ coordinates per datapoint necessary for estimation, or is it possible with only $\ell = 1$? In this paper we show that, under certain assumptions, it is possible to estimate some subspaces up to a finite choice with as few as $\ell = 1$ entry per column. This raises the question of whether there exist other subspace estimation methods that allow $\ell \leq r$ coordinates per datapoint and that are more efficient than the sample covariance, which converges slowly in the number of datapoints $n$.
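The covariance trick the abstract refers to is concrete enough to simulate. The sketch below is a minimal illustration, not code from the paper: the Gaussian data model, the dimensions, and the uniform choice of 2 revealed coordinates per column are all assumptions made here. It draws data from a random $r$-dimensional subspace of $\mathbb{R}^d$, reveals only $\ell = 2$ coordinates per column, estimates each covariance entry by averaging the products of co-observed entries, and recovers the subspace from the top $r$ eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 10, 2, 200_000  # ambient dimension, subspace dimension, columns

# Ground-truth r-dimensional subspace and data lying exactly on it.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))
X = U @ rng.standard_normal((r, n))

# Reveal exactly ell = 2 coordinates per column, chosen uniformly at random.
sums = np.zeros((d, d))
counts = np.zeros((d, d))
for k in range(n):
    i, j = rng.choice(d, size=2, replace=False)
    for a in (i, j):
        for b in (i, j):
            sums[a, b] += X[a, k] * X[b, k]
            counts[a, b] += 1

# Entrywise sample covariance: average x_a * x_b over the columns in which
# coordinates a and b happened to be observed together.
cov = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

# The span of the top-r eigenvectors is the subspace estimate.
eigvals, eigvecs = np.linalg.eigh(cov)
U_hat = eigvecs[:, -r:]

# Largest principal angle between the true and estimated subspaces
# (the singular values of U^T U_hat are the cosines of the principal angles).
cosines = np.linalg.svd(U.T @ U_hat, compute_uv=False)
print("largest principal angle (radians):",
      np.arccos(np.clip(cosines.min(), -1.0, 1.0)))
```

Note that even with hundreds of thousands of columns the recovered principal angle is only moderately small: each off-diagonal covariance entry is observed in roughly $n/\binom{d}{2}$ columns, which is consistent with the abstract's closing remark that the sample covariance converges slowly in $n$.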