Mixture models, robustness, and sum of squares proofs

Samuel B. Hopkins, Jerry Li
{"title":"Mixture models, robustness, and sum of squares proofs","authors":"Samuel B. Hopkins, Jerry Li","doi":"10.1145/3188745.3188748","DOIUrl":null,"url":null,"abstract":"We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved by previous efficient algorithms. Our contributions are: Mixture models with separated means: We study mixtures of poly(k)-many k-dimensional distributions where the means of every pair of distributions are separated by at least kε. In the special case of spherical Gaussian mixtures, we give a kO(1/ε)-time algorithm that learns the means assuming separation at least kε, for any ε> 0. This is the first algorithm to improve on greedy (“single-linkage”) and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k1/4. Robust estimation: When an unknown (1−ε)-fraction of X1,…,Xn are chosen from a sub-Gaussian distribution with mean µ but the remaining points are chosen adversarially, we give an algorithm recovering µ to error ε1−1/t in time kO(t), so long as sub-Gaussian-ness up to O(t) moments can be certified by a Sum of Squares proof. This is the first polynomial-time algorithm with guarantees approaching the information-theoretic limit for non-Gaussian distributions. Previous algorithms could not achieve error better than ε1/2. As a corollary, we achieve similar results for robust covariance estimation. Both of these results are based on a unified technique. Inspired by recent algorithms of Diakonikolas et al. in robust statistics, we devise an SDP based on the Sum of Squares method for the following setting: given X1,…,Xn ∈ ℝk for large k and n = poly(k) with the promise that a subset of X1,…,Xn were sampled from a probability distribution with bounded moments, recover some information about that distribution.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"158","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3188745.3188748","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 158

Abstract

We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved by previous efficient algorithms. Our contributions are:

Mixture models with separated means: We study mixtures of poly(k)-many k-dimensional distributions where the means of every pair of distributions are separated by at least k^ε. In the special case of spherical Gaussian mixtures, we give a k^{O(1/ε)}-time algorithm that learns the means assuming separation at least k^ε, for any ε > 0. This is the first algorithm to improve on greedy ("single-linkage") and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k^{1/4}.

Robust estimation: When an unknown (1−ε)-fraction of X_1, …, X_n are chosen from a sub-Gaussian distribution with mean µ but the remaining points are chosen adversarially, we give an algorithm recovering µ to error ε^{1−1/t} in time k^{O(t)}, so long as sub-Gaussian-ness up to O(t) moments can be certified by a Sum of Squares proof. This is the first polynomial-time algorithm with guarantees approaching the information-theoretic limit for non-Gaussian distributions. Previous algorithms could not achieve error better than ε^{1/2}. As a corollary, we achieve similar results for robust covariance estimation.

Both of these results are based on a unified technique. Inspired by recent algorithms of Diakonikolas et al. in robust statistics, we devise an SDP based on the Sum of Squares method for the following setting: given X_1, …, X_n ∈ ℝ^k for large k and n = poly(k), with the promise that a subset of X_1, …, X_n were sampled from a probability distribution with bounded moments, recover some information about that distribution.
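For a rough sense of scale on the first result: in k = 10^4 dimensions, separation k^ε with ε = 0.1 is k^{0.1} ≈ 2.5, while the previous barrier k^{1/4} = 10, so the new algorithm tolerates means an order of magnitude closer, at cost k^{O(1/ε)}. For the second result, certifying t = 2 moments recovers the earlier ε^{1/2} error rate, and larger t drives the error ε^{1−1/t} toward ε at the price of k^{O(t)} running time.

The paper's SoS-based SDPs are too heavy to reproduce here, but the spectral filtering idea of Diakonikolas et al., which the authors cite as inspiration and which underlies the previous ε^{1/2} baseline, fits in a short sketch. Below is a minimal, illustrative Python/NumPy sketch of that filtering baseline for robust mean estimation under bounded covariance; the function name filter_mean, the stopping threshold, and the remove-one-point-per-round rule are simplifying assumptions of this sketch, not details from the paper.

```python
import numpy as np

def filter_mean(X, eps):
    """Spectral filtering baseline for robust mean estimation (sketch).

    X: (n, k) array of samples, an eps-fraction of which may be adversarial.
    Assumes inliers have covariance bounded by the identity; then the
    returned estimate is within O(sqrt(eps)) of the true mean.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Each round removes one point, so ~2*eps*n rounds suffice here.
    for _ in range(int(2 * eps * n) + 1):
        mu = X.mean(axis=0)
        centered = X - mu
        cov = centered.T @ centered / len(X)
        eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues ascending
        top_val, v = eigvals[-1], eigvecs[:, -1]    # top eigenpair
        # If no direction has inflated variance, outliers can no longer
        # shift the empirical mean much: stop and return it.
        if top_val <= 1 + 10 * eps:                 # threshold is illustrative
            break
        # Otherwise drop the point with the largest squared projection
        # onto the top eigenvector. (Real filters remove points randomly
        # with probability proportional to this score, which is what the
        # analysis of Diakonikolas et al. requires.)
        scores = (centered @ v) ** 2
        X = np.delete(X, np.argmax(scores), axis=0)
    return X.mean(axis=0)

# Usage: Gaussian inliers around the origin plus a clump of outliers
# placed to drag the naive empirical mean away from it.
rng = np.random.default_rng(0)
n, k, eps = 2000, 20, 0.05
inliers = rng.standard_normal((n - int(eps * n), k))
outliers = np.full((int(eps * n), k), 5.0)
X = np.vstack([inliers, outliers])
print(np.linalg.norm(X.mean(axis=0)))        # naive mean: pulled toward outliers
print(np.linalg.norm(filter_mean(X, eps)))   # filtered mean: close to 0
```

By contrast, the paper replaces this top-eigenvalue test with a Sum of Squares certificate on O(t) moments, which is what upgrades the error guarantee from ε^{1/2} to ε^{1−1/t}.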