Breaking the Limits of Subspace Inference

Claudia R. Solís-Lemus, Daniel L. Pimentel-Alarcón
{"title":"Breaking the Limits of Subspace Inference","authors":"Claudia R. Solís-Lemus, Daniel L. Pimentel-Alarcón","doi":"10.1109/ALLERTON.2018.8635999","DOIUrl":null,"url":null,"abstract":"Inferring low-dimensional subspaces that describe high-dimensional, highly incomplete datasets has become a routinely procedure in modern data science. This paper is about a curious phenomenon related to the amount of information required to estimate a subspace. On one hand, it has been shown that information-theoretically, data in $\\mathbb {R}^{\\mathrm {d}}$ must be observed on at least $\\ell =\\mathrm {r}+1$ coordinates to uniquely identify an r-dimensional subspace that approximates it. On the other hand, it is well- known that the subspace containing a dataset can be estimated through its sample covariance matrix, which only requires observing 2 coordinates per datapoint (regardless of $\\mathrm {r}!$). At first glance, this may seem to contradict the information-theoretic bound. The key lies in the subtle difference between identifiability (uniqueness) and estimation (most probable). It is true that if we only observed $\\ell \\leq \\mathrm {r}$ coordinates per datapoint, there will be infinitely many r-dimensional subspaces that perfectly agree with the observations. However, some subspaces may be more likely than others, which are revealed by the sample covariance. This raises several fundamental questions: what are the algebraic relationships hidden in 2 coordinates that allow estimating an r-dimensional subspace? Moreover, are $\\ell = 2$ coordinates per datapoint necessary for estimation, or is it possible with only $\\ell =1$? In this paper we show that under certain assumptions, it is possible to estimate some subspaces up to finite choice with as few as $\\ell =1$ entry per column. This paper raises the question of whether there exist other subspace estimation methods that allow $\\ell \\leq \\mathrm {r}$ coordinates per datapoint, and that are more efficient than the sample covariance, which converges slowly in the number of data points n.","PeriodicalId":299280,"journal":{"name":"2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ALLERTON.2018.8635999","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Inferring low-dimensional subspaces that describe high-dimensional, highly incomplete datasets has become a routine procedure in modern data science. This paper is about a curious phenomenon related to the amount of information required to estimate a subspace. On one hand, it has been shown that, information-theoretically, data in $\mathbb{R}^d$ must be observed on at least $\ell = r + 1$ coordinates to uniquely identify an $r$-dimensional subspace that approximates it. On the other hand, it is well-known that the subspace containing a dataset can be estimated through its sample covariance matrix, which only requires observing 2 coordinates per datapoint (regardless of $r$!). At first glance, this may seem to contradict the information-theoretic bound. The key lies in the subtle difference between identifiability (uniqueness) and estimation (most probable candidate). It is true that if we only observe $\ell \leq r$ coordinates per datapoint, there will be infinitely many $r$-dimensional subspaces that perfectly agree with the observations. However, some subspaces may be more likely than others, and these are revealed by the sample covariance. This raises several fundamental questions: what are the algebraic relationships hidden in 2 coordinates that allow estimating an $r$-dimensional subspace? Moreover, are $\ell = 2$ coordinates per datapoint necessary for estimation, or is estimation possible with only $\ell = 1$? In this paper we show that under certain assumptions, it is possible to estimate some subspaces up to a finite choice with as few as $\ell = 1$ entry per column. This paper raises the question of whether there exist other subspace estimation methods that allow $\ell \leq r$ coordinates per datapoint and that are more efficient than the sample covariance, which converges slowly in the number of data points $n$.
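To make the $\ell = 2$ phenomenon concrete, the sketch below estimates an $r$-dimensional subspace from columns observed on only two coordinates each: every entry $(i, j)$ of the covariance is averaged over the columns in which both coordinates $i$ and $j$ happen to be observed, and the span of the top-$r$ eigenvectors recovers the subspace. This is a minimal illustration under assumed conditions (Gaussian coefficients, observation pairs chosen uniformly at random), not the paper's $\ell = 1$ method; names such as `cov_hat` and `U_hat` are ours, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 20, 3, 200_000  # ambient dimension, subspace dimension, columns

# Data lying exactly on a random r-dimensional subspace of R^d.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal basis
X = U @ rng.standard_normal((r, n))

# Observe only ell = 2 coordinates per column, uniformly at random.
sums = np.zeros((d, d))
counts = np.zeros((d, d))
for k in range(n):
    i, j = rng.choice(d, size=2, replace=False)
    for a in (i, j):
        for b in (i, j):
            sums[a, b] += X[a, k] * X[b, k]
            counts[a, b] += 1

# Each covariance entry is averaged over the columns where both of its
# coordinates were observed (pairs never observed together stay 0).
cov_hat = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

# The span of the top-r eigenvectors estimates the subspace.
_, eigvecs = np.linalg.eigh(cov_hat)  # eigenvalues in ascending order
U_hat = eigvecs[:, -r:]

# Cosines of the principal angles between span(U_hat) and span(U);
# values near 1 mean the subspace was recovered accurately.
cosines = np.linalg.svd(U_hat.T @ U, compute_uv=False)
print("principal-angle cosines:", np.round(cosines, 4))
```

Note that each off-diagonal entry of `cov_hat` is estimated from only the small fraction of columns that reveal that particular coordinate pair, which is one way to see why the sample-covariance approach converges slowly in $n$.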