Breaking the Limits of Subspace Inference
Claudia R. Solís-Lemus, Daniel L. Pimentel-Alarcón
2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), October 2018
DOI: 10.1109/ALLERTON.2018.8635999
Abstract
Inferring low-dimensional subspaces that describe high-dimensional, highly incomplete datasets has become a routine procedure in modern data science. This paper is about a curious phenomenon related to the amount of information required to estimate a subspace. On one hand, it has been shown that, information-theoretically, data in $\mathbb{R}^d$ must be observed on at least $\ell = r+1$ coordinates to uniquely identify an $r$-dimensional subspace that approximates it. On the other hand, it is well-known that the subspace containing a dataset can be estimated through its sample covariance matrix, which only requires observing $\ell = 2$ coordinates per datapoint (regardless of $r$!). At first glance, this may seem to contradict the information-theoretic bound. The key lies in the subtle difference between identifiability (uniqueness) and estimation (most probable). It is true that if we only observe $\ell \leq r$ coordinates per datapoint, there will be infinitely many $r$-dimensional subspaces that perfectly agree with the observations. However, some subspaces may be more likely than others, and the sample covariance reveals which. This raises several fundamental questions: what are the algebraic relationships hidden in $2$ coordinates that allow estimating an $r$-dimensional subspace? Moreover, are $\ell = 2$ coordinates per datapoint necessary for estimation, or is it possible with only $\ell = 1$? In this paper we show that, under certain assumptions, it is possible to estimate some subspaces up to a finite choice with as few as $\ell = 1$ entry per column. This raises the question of whether there exist other subspace estimation methods that allow $\ell \leq r$ coordinates per datapoint and that are more efficient than the sample covariance, which converges slowly in the number of datapoints $n$.
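The covariance trick the abstract refers to is concrete enough to simulate. The sketch below is a minimal illustration, not code from the paper: the Gaussian data model, the dimensions, and the uniform choice of 2 revealed coordinates per column are all assumptions made here. It draws data from a random $r$-dimensional subspace of $\mathbb{R}^d$, reveals only $\ell = 2$ coordinates per column, estimates each covariance entry by averaging the products of co-observed entries, and recovers the subspace from the top $r$ eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 10, 2, 200_000  # ambient dimension, subspace dimension, columns

# Ground-truth r-dimensional subspace and data lying exactly on it.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))
X = U @ rng.standard_normal((r, n))

# Reveal exactly ell = 2 coordinates per column, chosen uniformly at random.
sums = np.zeros((d, d))
counts = np.zeros((d, d))
for k in range(n):
    i, j = rng.choice(d, size=2, replace=False)
    for a in (i, j):
        for b in (i, j):
            sums[a, b] += X[a, k] * X[b, k]
            counts[a, b] += 1

# Entrywise sample covariance: average x_a * x_b over the columns in which
# coordinates a and b happened to be observed together.
cov = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

# The span of the top-r eigenvectors is the subspace estimate.
eigvals, eigvecs = np.linalg.eigh(cov)
U_hat = eigvecs[:, -r:]

# Largest principal angle between the true and estimated subspaces
# (the singular values of U^T U_hat are the cosines of the principal angles).
cosines = np.linalg.svd(U.T @ U_hat, compute_uv=False)
print("largest principal angle (radians):",
      np.arccos(np.clip(cosines.min(), -1.0, 1.0)))
```

Note that even with hundreds of thousands of columns the recovered principal angle is only moderately small: each off-diagonal covariance entry is observed in roughly $n/\binom{d}{2}$ columns, which is consistent with the abstract's closing remark that the sample covariance converges slowly in $n$.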