SIAM Journal on Mathematics of Data Science: Latest Articles

Randomized Wasserstein Barycenter Computation: Resampling with Statistical Guarantees
SIAM journal on mathematics of data science Pub Date : 2020-12-11 DOI: 10.1137/20m1385263
F. Heinemann, A. Munk, Y. Zemel
Abstract: We propose a hybrid resampling method to approximate finitely supported Wasserstein barycenters on large-scale datasets, which can be combined with any exact solver. Nonasymptotic bounds on the expected error of the objective value as well as the barycenters themselves allow to calibrate computational cost and statistical accuracy. The rate of these upper bounds is shown to be optimal and independent of the underlying dimension, which appears only in the constants. Using a simple modification of the subgradient descent algorithm of Cuturi and Doucet, we showcase the applicability of our method on a myriad of simulated datasets, as well as a real-data example which are out of reach for state of the art algorithms for computing Wasserstein barycenters.
Volume: 69 1. Pages: 229-259.
Citations: 16
Stochastic Tverberg Theorems With Applications in Multiclass Logistic Regression, Separability, and Centerpoints of Data
SIAM journal on mathematics of data science Pub Date : 2020-12-10 DOI: 10.1137/19m1277102
J. D. Loera, T. A. Hogan
Abstract: We present new stochastic geometry theorems that give bounds on the probability that $m$ random data classes all contain a point in common in their convex hulls. These theorems relate to the existe...
Volume: 116 1. Pages: 1151-1166.
Citations: 4
Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting, and Regularization
SIAM journal on mathematics of data science Pub Date : 2020-11-18 DOI: 10.1137/21m1415121
Ke Wang, Christos Thrampoulidis
Abstract: Deep neural networks generalize well despite being exceedingly overparameterized and being trained without explicit regularization. This curious phenomenon has inspired extensive research activity in establishing its statistical principles: Under what conditions is it observed? How do these depend on the data and on the training algorithm? When does regularization benefit generalization? While such questions remain wide open for deep neural nets, recent works have attempted gaining insights by studying simpler, often linear, models. Our paper contributes to this growing line of work by examining binary linear classification under a generative Gaussian mixture model. Motivated by recent results on the implicit bias of gradient descent, we study both max-margin SVM classifiers (corresponding to logistic loss) and min-norm interpolating classifiers (corresponding to least-squares loss). First, we leverage an idea introduced in [V. Muthukumar et al., arXiv:2005.08054, (2020)] to relate the SVM solution to the min-norm interpolating solution. Second, we derive novel non-asymptotic bounds on the classification error of the latter. Combining the two, we present novel sufficient conditions on the covariance spectrum and on the signal-to-noise ratio (SNR) under which interpolating estimators achieve asymptotically optimal performance as overparameterization increases. Interestingly, our results extend to a noisy model with constant probability noise flips. Contrary to previously studied discriminative data models, our results emphasize the crucial role of the SNR and its interplay with the data covariance. Finally, via a combination of analytical arguments and numerical demonstrations we identify conditions under which the interpolating estimator performs better than corresponding regularized estimates.
Volume: 20 1. Pages: 260-284.
Citations: 18
Memory Capacity of Neural Networks with Threshold and Rectified Linear Unit Activations
SIAM journal on mathematics of data science Pub Date : 2020-10-20 DOI: 10.1137/20m1314884
R. Vershynin
Abstract: Overwhelming theoretical and empirical evidence shows that mildly overparametrized neural networks---those with more connections than the size of the training data---are often able to memorize the ...
Volume: 19 1. Pages: 1004-1033.
Citations: 41
Towards Compact Neural Networks via End-to-End Training: A Bayesian Tensor Approach with Automatic Rank Determination
SIAM journal on mathematics of data science Pub Date : 2020-10-17 DOI: 10.1137/21m1391444
Cole Hawkins, Xing-er Liu, Zheng Zhang
Abstract: While post-training model compression can greatly reduce the inference cost of a deep neural network, uncompressed training still consumes a huge amount of hardware resources, run-time and energy. It is highly desirable to directly train a compact neural network from scratch with low memory and low computational cost. Low-rank tensor decomposition is one of the most effective approaches to reduce the memory and computing requirements of large-size neural networks. However, directly training a low-rank tensorized neural network is a very challenging task because it is hard to determine a proper tensor rank a priori, which controls the model complexity and compression ratio in the training process. This paper presents a novel end-to-end framework for low-rank tensorized training of neural networks. We first develop a flexible Bayesian model that can handle various low-rank tensor formats (e.g., CP, Tucker, tensor train and tensor-train matrix) that compress neural network parameters in training. This model can automatically determine the tensor ranks inside a nonlinear forward model, which is beyond the capability of existing Bayesian tensor methods. We further develop a scalable stochastic variational inference solver to estimate the posterior density of large-scale problems in training. Our work provides the first general-purpose rank-adaptive framework for end-to-end tensorized training. Our numerical results on various neural network architectures show orders-of-magnitude parameter reduction and little accuracy loss (or even better accuracy) in the training process. Specifically, on a very large deep learning recommendation system with over $4.2\times 10^9$ model parameters, our method can reduce the variables to only $1.6\times 10^5$ automatically in the training process (i.e., by $2.6\times 10^4$ times) while achieving almost the same accuracy.
Volume: 45 1. Pages: 46-71.
Citations: 17
Consistency of Archetypal Analysis
SIAM journal on mathematics of data science Pub Date : 2020-10-16 DOI: 10.1137/20M1331792
B. Osting, Dong Wang, Yiming Xu, Dominique Zosso
Abstract: Archetypal analysis is an unsupervised learning method that uses a convex polytope to summarize multivariate data. For fixed $k$, the method finds a convex polytope with $k$ vertices, called archetype points, such that the polytope is contained in the convex hull of the data and the mean squared distance between the data and the polytope is minimal. In this paper, we prove a consistency result that shows if the data is independently sampled from a probability measure with bounded support, then the archetype points converge to a solution of the continuum version of the problem, of which we identify and establish several properties. We also obtain the convergence rate of the optimal objective values under appropriate assumptions on the distribution. If the data is independently sampled from a distribution with unbounded support, we also prove a consistency result for a modified method that penalizes the dispersion of the archetype points. Our analysis is supported by detailed computational experiments of the archetype points for data sampled from the uniform distribution in a disk, the normal distribution, an annular distribution, and a Gaussian mixture model.
Volume: 50 1. Pages: 1-30.
Citations: 3
Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
SIAM journal on mathematics of data science Pub Date : 2020-10-15 DOI: 10.1137/19m1258104
Abhishek K. Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar
Abstract: Recursive stochastic algorithms have gained significant attention in the recent past due to data-driven applications. Examples include stochastic gradient descent for solving large-scale optimizati...
Volume: 97 1. Pages: 967-1003.
Citations: 4
Multilayer Modularity Belief Propagation to Assess Detectability of Community Structure
SIAM journal on mathematics of data science Pub Date : 2020-09-28 DOI: 10.1137/19m1279812
W. Weir, Benjamin Walker, Lenka Zdeborová, P. Mucha
Abstract: Modularity-based community detection encompasses a number of widely used, efficient heuristics for identification of structure in networks. Recently, a belief propagation approach to modularity opt...
Volume: 1 1. Pages: 872-900.
Citations: 2
Sequential Construction and Dimension Reduction of Gaussian Processes Under Inequality Constraints
SIAM journal on mathematics of data science Pub Date : 2020-09-09 DOI: 10.1137/21m1407513
F. Bachoc, A. F. López-Lopera, O. Roustant
Abstract: Accounting for inequality constraints, such as boundedness, monotonicity or convexity, is challenging when modeling costly-to-evaluate black box functions. In this regard, finite-dimensional Gaussian process (GP) models bring a valuable solution, as they guarantee that the inequality constraints are satisfied everywhere. Nevertheless, these models are currently restricted to small dimensional situations (up to dimension 5). Addressing this issue, we introduce the MaxMod algorithm that sequentially inserts one-dimensional knots or adds active variables, thereby performing at the same time dimension reduction and efficient knot allocation. We prove the convergence of this algorithm. In intermediary steps of the proof, we propose the notion of multi-affine extension and study its properties. We also prove the convergence of finite-dimensional GPs, when the knots are not dense in the input space, extending the recent literature. With simulated and real data, we demonstrate that the MaxMod algorithm remains efficient in higher dimension (at least in dimension 20), and has a smaller computational complexity than other constrained GP models from the state-of-the-art, to reach a given approximation error.
Citations: 2
Exponential-Wrapped Distributions on Symmetric Spaces
SIAM journal on mathematics of data science Pub Date : 2020-09-04 DOI: 10.1137/21m1461551
Emmanuel Chevallier, Didong Li, Yulong Lu, D. Dunson
Abstract: In many applications, the curvature of the space supporting the data makes the statistical modelling challenging. In this paper we discuss the construction and use of probability distributions wrapped around manifolds using exponential maps. These distributions have already been used on specific manifolds. We describe their construction in the unifying framework of affine locally symmetric spaces. Affine locally symmetric spaces are a broad class of manifolds containing many manifolds encountered in data sciences. We show that on these spaces, exponential-wrapped distributions enjoy interesting properties for practical use. We provide the generic expression of the Jacobian appearing in these distributions and compute it on two particular examples: Grassmannians and pseudo-hyperboloids. We illustrate the interest of such distributions in a classification experiment on simulated data.
Citations: 4