List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians

Ilias Diakonikolas, D. Kane, Alistair Stewart
{"title":"表可译码鲁棒平均估计和球形高斯的学习混合","authors":"Ilias Diakonikolas, D. Kane, Alistair Stewart","doi":"10.1145/3188745.3188758","DOIUrl":null,"url":null,"abstract":"We study the problem of list-decodable (robust) Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. In the former problem, we are given a set T of points in n with the promise that an α-fraction of points in T, where 0< α < 1/2, are drawn from an unknown mean identity covariance Gaussian G, and no assumptions are made about the remaining points. The goal is to output a small list of candidate vectors with the guarantee that at least one of the candidates is close to the mean of G. In the latter problem, we are given samples from a k-mixture of spherical Gaussians on n and the goal is to estimate the unknown model parameters up to small accuracy. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems. Specifically, our main contributions are as follows: List-Decodable Mean Estimation. Fix any d ∈ + and 0< α <1/2. We design an algorithm with sample complexity Od ((nd/α)) and runtime Od ((n/α)d) that outputs a list of O(1/α) many candidate vectors such that with high probability one of the candidates is within ℓ2-distance Od(α−1/(2d)) from the mean of G. The only previous algorithm for this problem achieved error Õ(α−1/2) under second moment conditions. For d = O(1/), where >0 is a constant, our algorithm runs in polynomial time and achieves error O(α). For d = Θ(log(1/α)), our algorithm runs in time (n/α)O(log(1/α)) and achieves error O(log3/2(1/α)), almost matching the information-theoretically optimal bound of Θ(log1/2(1/α)) that we establish. We also give a Statistical Query (SQ) lower bound suggesting that the complexity of our algorithm is qualitatively close to best possible. Learning Mixtures of Spherical Gaussians. We give a learning algorithm for mixtures of spherical Gaussians, with unknown spherical covariances, that succeeds under significantly weaker separation assumptions compared to prior work. For the prototypical case of a uniform k-mixture of identity covariance Gaussians we obtain the following: For any >0, if the pairwise separation between the means is at least Ω(k+√log(1/δ)), our algorithm learns the unknown parameters within accuracy δ with sample complexity and running time (n, 1/δ, (k/)1/). Moreover, our algorithm is robust to a small dimension-independent fraction of corrupted data. The previously best known polynomial time algorithm required separation at least k1/4 (k/δ). Finally, our algorithm works under separation of Õ(log3/2(k)+√log(1/δ)) with sample complexity and running time (n, 1/δ, klogk). This bound is close to the information-theoretically minimum separation of Ω(√logk). Our main technical contribution is a new technique, using degree-d multivariate polynomials, to remove outliers from high-dimensional datasets where the majority of the points are corrupted.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"126","resultStr":"{\"title\":\"List-decodable robust mean estimation and learning mixtures of spherical gaussians\",\"authors\":\"Ilias Diakonikolas, D. 
Kane, Alistair Stewart\",\"doi\":\"10.1145/3188745.3188758\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We study the problem of list-decodable (robust) Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. In the former problem, we are given a set T of points in n with the promise that an α-fraction of points in T, where 0< α < 1/2, are drawn from an unknown mean identity covariance Gaussian G, and no assumptions are made about the remaining points. The goal is to output a small list of candidate vectors with the guarantee that at least one of the candidates is close to the mean of G. In the latter problem, we are given samples from a k-mixture of spherical Gaussians on n and the goal is to estimate the unknown model parameters up to small accuracy. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems. Specifically, our main contributions are as follows: List-Decodable Mean Estimation. Fix any d ∈ + and 0< α <1/2. We design an algorithm with sample complexity Od ((nd/α)) and runtime Od ((n/α)d) that outputs a list of O(1/α) many candidate vectors such that with high probability one of the candidates is within ℓ2-distance Od(α−1/(2d)) from the mean of G. The only previous algorithm for this problem achieved error Õ(α−1/2) under second moment conditions. For d = O(1/), where >0 is a constant, our algorithm runs in polynomial time and achieves error O(α). For d = Θ(log(1/α)), our algorithm runs in time (n/α)O(log(1/α)) and achieves error O(log3/2(1/α)), almost matching the information-theoretically optimal bound of Θ(log1/2(1/α)) that we establish. We also give a Statistical Query (SQ) lower bound suggesting that the complexity of our algorithm is qualitatively close to best possible. Learning Mixtures of Spherical Gaussians. We give a learning algorithm for mixtures of spherical Gaussians, with unknown spherical covariances, that succeeds under significantly weaker separation assumptions compared to prior work. For the prototypical case of a uniform k-mixture of identity covariance Gaussians we obtain the following: For any >0, if the pairwise separation between the means is at least Ω(k+√log(1/δ)), our algorithm learns the unknown parameters within accuracy δ with sample complexity and running time (n, 1/δ, (k/)1/). Moreover, our algorithm is robust to a small dimension-independent fraction of corrupted data. The previously best known polynomial time algorithm required separation at least k1/4 (k/δ). Finally, our algorithm works under separation of Õ(log3/2(k)+√log(1/δ)) with sample complexity and running time (n, 1/δ, klogk). This bound is close to the information-theoretically minimum separation of Ω(√logk). 
Our main technical contribution is a new technique, using degree-d multivariate polynomials, to remove outliers from high-dimensional datasets where the majority of the points are corrupted.\",\"PeriodicalId\":20593,\"journal\":{\"name\":\"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-11-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"126\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3188745.3188758\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3188745.3188758","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 126

Abstract

We study the problem of list-decodable (robust) Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. In the former problem, we are given a set T of points in ℝ^n with the promise that an α-fraction of the points in T, where 0 < α < 1/2, are drawn from an unknown-mean, identity-covariance Gaussian G; no assumptions are made about the remaining points. The goal is to output a small list of candidate vectors with the guarantee that at least one of the candidates is close to the mean of G. In the latter problem, we are given samples from a k-mixture of spherical Gaussians on ℝ^n and the goal is to estimate the unknown model parameters to small accuracy. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems. Specifically, our main contributions are as follows.

List-Decodable Mean Estimation. Fix any d ∈ ℤ+ and 0 < α < 1/2. We design an algorithm with sample complexity O_d(poly(n^d/α)) and runtime O_d((n/α)^d) that outputs a list of O(1/α) candidate vectors such that, with high probability, one of the candidates is within ℓ2-distance O_d(α^(−1/(2d))) of the mean of G. The only previous algorithm for this problem achieved error Õ(α^(−1/2)) under second moment conditions. For d = O(1/ε), where ε > 0 is a constant, our algorithm runs in polynomial time and achieves error O(α^(−ε)). For d = Θ(log(1/α)), our algorithm runs in time (n/α)^(O(log(1/α))) and achieves error O(log^(3/2)(1/α)), almost matching the information-theoretically optimal bound of Θ(log^(1/2)(1/α)) that we establish. We also give a Statistical Query (SQ) lower bound suggesting that the complexity of our algorithm is qualitatively close to the best possible.

Learning Mixtures of Spherical Gaussians. We give a learning algorithm for mixtures of spherical Gaussians, with unknown spherical covariances, that succeeds under significantly weaker separation assumptions than prior work. For the prototypical case of a uniform k-mixture of identity-covariance Gaussians we obtain the following. For any ε > 0, if the pairwise separation between the means is at least Ω(k^ε + √log(1/δ)), our algorithm learns the unknown parameters to accuracy δ with sample complexity and running time poly(n, 1/δ, (k/ε)^(1/ε)). Moreover, the algorithm is robust to a small, dimension-independent fraction of corrupted data. The previously best known polynomial-time algorithm required separation at least k^(1/4) polylog(k/δ). Finally, our algorithm works under separation Õ(log^(3/2)(k) + √log(1/δ)) with sample complexity and running time poly(n, 1/δ, k^(log k)); this bound is close to the information-theoretically minimal separation of Ω(√log k).

Our main technical contribution is a new technique, based on degree-d multivariate polynomials, for removing outliers from high-dimensional datasets in which the majority of the points are corrupted.
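
To make the list-decodable setting concrete, here is a minimal, self-contained Python sketch. It is illustrative only: sample_corrupted and candidate_means are hypothetical helpers, not the paper's algorithm. It plants an α-fraction of inliers from N(μ, I) among a majority of adversarial points and returns a list of O(1/α) candidate means via naive clustering; the paper instead removes outliers with degree-d multivariate polynomials to guarantee a candidate within O_d(α^(−1/(2d))) of μ.

```python
# Illustrative only: the list-decodable data model (alpha-fraction of inliers,
# majority adversarial) and a naive clustering baseline that outputs a short
# candidate list. Helper names here are hypothetical, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

def sample_corrupted(n, N, alpha, mu):
    """N points in R^n: an alpha-fraction from N(mu, I); the rest are placed
    by a crude 'adversary' as far-away spurious clusters."""
    n_in = int(alpha * N)
    inliers = mu + rng.normal(size=(n_in, n))
    centers = rng.choice([-30.0, 15.0, 40.0], size=N - n_in)
    outliers = centers[:, None] + rng.normal(size=(N - n_in, n))
    return np.vstack([inliers, outliers])

def candidate_means(T, alpha, iters=25):
    """List of O(1/alpha) candidates: farthest-point seeding + Lloyd steps.
    At least one cluster is mostly inliers, so one candidate lands near mu."""
    k = int(np.ceil(2.0 / alpha))
    C = [T[0]]
    for _ in range(k - 1):                     # farthest-point seeding
        d2 = np.min([((T - c) ** 2).sum(axis=1) for c in C], axis=0)
        C.append(T[d2.argmax()])
    C = np.array(C)
    for _ in range(iters):                     # Lloyd refinement
        labels = ((T[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        C = np.array([T[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                      for j in range(k)])
    return C

n, N, alpha = 20, 2000, 0.1
mu = np.full(n, 5.0)
T = sample_corrupted(n, N, alpha, mu)
best = min(np.linalg.norm(c - mu) for c in candidate_means(T, alpha))
print(f"list size {int(np.ceil(2.0 / alpha))}, best l2 error: {best:.2f}")
```

This crude baseline only succeeds because the planted outliers happen to form far-away clusters; the paper makes no such assumption about the corrupted majority, which is exactly what the degree-d polynomial outlier-removal technique is for.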
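
The separation condition in the mixture-learning result can be exercised the same way. The sketch below is again illustrative, with hypothetical helpers separated_means and sample_mixture: it plants a uniform k-mixture whose means are pairwise separated by a comfortable margin, at which even naive seeding-plus-Lloyd clustering recovers the means. The regime the paper targets, separation as small as Õ(log^(3/2)(k) + √log(1/δ)), is far below what such naive methods are known to tolerate.

```python
# Illustrative only: a uniform k-mixture of identity-covariance spherical
# Gaussians with prescribed pairwise mean separation, recovered here by
# plain farthest-point seeding + Lloyd iterations (not the paper's method).
import numpy as np

rng = np.random.default_rng(1)

def separated_means(k, n, sep):
    """k means in R^n whose minimum pairwise l2 distance equals sep."""
    M = rng.normal(size=(k, n))
    dmin = min(np.linalg.norm(M[i] - M[j])
               for i in range(k) for j in range(i + 1, k))
    return M * (sep / dmin)  # rescaling makes the closest pair exactly sep apart

def sample_mixture(M, m):
    """m draws from the uniform mixture (1/k) * sum_i N(M[i], I)."""
    idx = rng.integers(len(M), size=m)
    return M[idx] + rng.normal(size=(m, M.shape[1]))

k, n, sep, m = 5, 20, 10.0, 4000
M = separated_means(k, n, sep)
X = sample_mixture(M, m)

# Farthest-point seeding: greedily pick points far from all chosen seeds.
C = [X[rng.integers(m)]]
for _ in range(k - 1):
    d2 = np.min([((X - c) ** 2).sum(axis=1) for c in C], axis=0)
    C.append(X[d2.argmax()])
C = np.array(C)

# Lloyd iterations; keep a center unchanged if its cluster empties out.
for _ in range(30):
    labels = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(axis=1)
    C = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                  for j in range(k)])

# Accuracy: distance from each true mean to its closest recovered center.
err = max(min(np.linalg.norm(mu - c) for c in C) for mu in M)
print(f"worst recovered-mean error at separation {sep}: {err:.2f}")
```

The naive route needs the means to be far apart relative to the in-cluster distance fluctuations for the seeding step to hit every component; the paper's separation requirement is dimension-independent and only polylogarithmic in k.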