Latest Articles in SIAM Journal on Mathematics of Data Science

Two Steps at a Time---Taking GAN Training in Stride with Tseng's Method
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-06-16 | DOI: 10.1137/21m1420939
A. Böhm, Michael Sedlmayer, E. R. Csetnek, R. Boț
Abstract: Motivated by the training of Generative Adversarial Networks (GANs), we study methods for solving minimax problems with additional nonsmooth regularizers. We do so by employing monotone operator theory, in particular the Forward-Backward-Forward (FBF) method, which avoids the known issue of limit cycling by correcting each update with a second gradient evaluation. Furthermore, we propose a seemingly new scheme which recycles old gradients to mitigate this additional computational cost. In doing so we rediscover a known method, related to Optimistic Gradient Descent Ascent (OGDA). For both schemes we prove novel convergence rates for convex-concave minimax problems via a unifying approach. The derived error bounds are in terms of the gap function for the ergodic iterates. For the deterministic and the stochastic problem we show convergence rates of $\mathcal{O}(1/k)$ and $\mathcal{O}(1/\sqrt{k})$, respectively. We complement our theoretical results with empirical improvements in the training of Wasserstein GANs on the CIFAR10 dataset.
Citations: 13
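A minimal sketch of one Tseng/FBF iteration for $\min_x \max_y f(x, y) + g(x) - h(y)$, where the nonsmooth regularizers enter through proximal maps. All names (`grad_x`, `prox_g`, etc.) are illustrative assumptions, not the authors' code:

```python
def fbf_step(x, y, grad_x, grad_y, prox_g, prox_h, step):
    """One Forward-Backward-Forward iteration for a convex-concave saddle problem."""
    gx, gy = grad_x(x, y), grad_y(x, y)
    # Forward-backward step: gradient move plus proximal (backward) correction
    x_bar = prox_g(x - step * gx, step)
    y_bar = prox_h(y + step * gy, step)
    # Second forward step: re-evaluate gradients at the intermediate point;
    # this extra evaluation is what rules out limit cycling
    x_new = x_bar - step * (grad_x(x_bar, y_bar) - gx)
    y_new = y_bar + step * (grad_y(x_bar, y_bar) - gy)
    return x_new, y_new

# Toy usage: the bilinear problem min_x max_y x*y (saddle point at the origin),
# with g = h = 0 so the "prox" is the identity
import numpy as np
grad_x = lambda x, y: y
grad_y = lambda x, y: x
ident = lambda z, t: z
x, y = np.array([1.0]), np.array([1.0])
for _ in range(200):
    x, y = fbf_step(x, y, grad_x, grad_y, ident, ident, step=0.2)
```

Plain gradient descent-ascent cycles on this bilinear example; the second forward evaluation is what makes the iterates spiral in.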
FANOK: Knockoffs in Linear Time
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-06-15 | DOI: 10.1137/20m1363698
Armin Askari, Quentin Rebjock, A. d’Aspremont, L. Ghaoui
Abstract: We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large-scale feature selection problems. Identifying the knockoff distribution requires solving a large-scale semidefinite program, for which we derive several efficient methods. One handles generic covariance matrices and has a complexity scaling as $O(p^3)$, where $p$ is the ambient dimension, while another assumes a rank-$k$ factor model on the covariance matrix to reduce this complexity bound to $O(pk^2)$. We also derive efficient procedures to both estimate factor models and sample knockoff covariates with complexity linear in the dimension. We test our methods on problems with $p$ as large as $500,000$.
Citations: 2
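For context, a dense-algebra sketch of the standard Gaussian model-X knockoff sampler that FANOK accelerates: the direct solve below costs $O(p^3)$, which is exactly the bottleneck the paper's factor-model route avoids. The diagonal `s` is assumed given (e.g., from the knockoff SDP):

```python
import numpy as np

def sample_gaussian_knockoffs(X, Sigma, s, rng=None):
    """Sample knockoffs X_tilde | X ~ N(mu, V) for rows of X drawn from N(0, Sigma),
    with mu = X (I - Sigma^{-1} D), V = 2D - D Sigma^{-1} D, and D = diag(s)."""
    rng = np.random.default_rng() if rng is None else rng
    n, p = X.shape
    D = np.diag(s)
    SinvD = np.linalg.solve(Sigma, D)            # Sigma^{-1} diag(s): the O(p^3) step
    mu = X @ (np.eye(p) - SinvD)
    V = 2 * D - D @ SinvD
    # Symmetrize and jitter before factorizing, for numerical safety
    L = np.linalg.cholesky((V + V.T) / 2 + 1e-10 * np.eye(p))
    return mu + rng.standard_normal((n, p)) @ L.T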
Overparameterization and Generalization Error: Weighted Trigonometric Interpolation
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-06-15 | DOI: 10.1137/21m1390955
Yuege Xie, H. Chou, H. Rauhut, Rachel A. Ward
Abstract: Motivated by the surprisingly good generalization properties of learned deep neural networks in overparameterized scenarios and by the related double descent phenomenon, this paper analyzes the relation between smoothness and low generalization error in an overparameterized linear learning problem. We study a random Fourier series model, where the task is to estimate the unknown Fourier coefficients from equidistant samples. We derive exact expressions for the generalization error of both plain and weighted least squares estimators. We show precisely how a bias towards smooth interpolants, in the form of weighted trigonometric interpolation, can lead to smaller generalization error in the overparameterized regime compared to the underparameterized regime. This provides insight into the power of overparameterization, which is common in modern machine learning.
Citations: 3
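The weighted estimator can be sketched as minimum weighted-norm interpolation: among all coefficient vectors fitting the $n$ samples, pick the one minimizing $\lVert W c \rVert_2$, with small weights on low frequencies favoring smooth interpolants. A simplified one-sided-spectrum sketch (the paper's random Fourier series setup is richer):

```python
import numpy as np

def weighted_min_norm_fit(y, K, w):
    """Solve min ||diag(w) c||_2 subject to F c = y, where F is the n x K matrix
    of Fourier features at n equispaced points and K > n (overparameterized).
    Weights w growing with frequency bias the fit toward smooth interpolants."""
    n = len(y)
    t = 2 * np.pi * np.arange(n) / n
    F = np.exp(1j * np.outer(t, np.arange(K)))    # n x K design matrix
    A = F / w                                      # F @ diag(1/w)
    # Minimum-norm solution of the underdetermined system A d = y, then c = d / w
    c = (A.conj().T @ np.linalg.solve(A @ A.conj().T, y)) / w
    return c
```

Setting `w = np.ones(K)` recovers plain min-norm least squares; increasing weights such as `w = 1 + np.arange(K)` implement the smoothness bias the abstract analyzes.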
The Trimmed Lasso: Sparse Recovery Guarantees and Practical Optimization by the Generalized Soft-Min Penalty
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-05-18 | DOI: 10.1137/20M1330634
Tal Amir, R. Basri, B. Nadler
Abstract: We present a new approach to solve the sparse approximation or best subset selection problem, namely find a $k$-sparse vector $\mathbf{x} \in \mathbb{R}^d$ that minimizes the $\ell_2$ residual $\lVert A\mathbf{x} - \mathbf{y} \rVert_2$. We consider a regularized approach, whereby this residual is penalized by the nonconvex trimmed lasso, defined as the $\ell_1$-norm of $\mathbf{x}$ excluding its $k$ largest-magnitude entries. We prove that the trimmed lasso has several appealing theoretical properties, and in particular derive sparse recovery guarantees assuming successful optimization of the penalized objective. Next, we show empirically that directly optimizing this objective can be quite challenging. Instead, we propose a surrogate for the trimmed lasso, called the generalized soft-min. This penalty smoothly interpolates between the classical lasso and the trimmed lasso, while taking into account all possible $k$-sparse patterns. The generalized soft-min penalty involves summation over $\binom{d}{k}$ terms, yet we derive a polynomial-time algorithm to compute it. This, in turn, yields a practical method for the original sparse approximation problem. Via simulations, we demonstrate its competitive performance compared to the current state of the art.
Citations: 11
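The trimmed lasso itself is simple to evaluate: sort the magnitudes and sum all but the $k$ largest. A small sketch of the penalty and the penalized objective (the generalized soft-min surrogate and its polynomial-time evaluation are the paper's machinery and are not reproduced here):

```python
import numpy as np

def trimmed_lasso(x, k):
    # l1 norm of x excluding its k largest-magnitude entries;
    # it vanishes exactly when x is k-sparse, which makes it a sparsity penalty
    mags = np.sort(np.abs(x))              # ascending order
    return mags[:max(len(x) - k, 0)].sum()

def penalized_objective(A, x, y, k, lam):
    # Residual plus trimmed-lasso penalty, as in the abstract
    return np.linalg.norm(A @ x - y) + lam * trimmed_lasso(x, k)
```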
Spectral Discovery of Jointly Smooth Features for Multimodal Data
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-04-09 | DOI: 10.1137/21M141590X
Or Yair, Felix Dietrich, Rotem Mulayoff, R. Talmon, I. Kevrekidis
Abstract: In this paper, we propose a spectral method for deriving functions that are jointly smooth on multiple observed manifolds. Our method is unsupervised and primarily consists of two steps. First, using kernels, we obtain a subspace spanning smooth functions on each manifold. Then, we apply a spectral method to the obtained subspaces and discover functions that are jointly smooth on all manifolds. We show analytically that our method is guaranteed to provide a set of orthogonal functions that are as jointly smooth as possible, ordered from the smoothest to the least smooth. In addition, we show that the proposed method can be efficiently extended to unseen data using the Nyström method. We demonstrate the proposed method on both simulated and real measured data and compare the results to nonlinear variants of the seminal Canonical Correlation Analysis (CCA). In particular, we show superior results for sleep stage identification. We also show how the proposed method can be leveraged to find minimal realizations of parameter spaces of nonlinear dynamical systems.
Citations: 6
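A compact sketch of the two-step recipe the abstract describes, under the assumption that the top eigenvectors of each kernel span the smooth functions on that manifold; variable names are illustrative:

```python
import numpy as np

def jointly_smooth_features(kernels, d, m):
    """kernels: list of n x n PSD kernel matrices, one per modality, over the
    same n samples. Returns m functions (columns) ordered from smoothest down."""
    bases = []
    for K in kernels:
        # Step 1: per modality, the top-d eigenvectors span the smoothest functions
        _, vecs = np.linalg.eigh(K)        # eigenvalues in ascending order
        bases.append(vecs[:, -d:])
    # Step 2: SVD of the stacked orthonormal bases; the leading left singular
    # vectors are the functions closest to lying in every subspace at once,
    # i.e., as jointly smooth as possible
    U, svals, _ = np.linalg.svd(np.hstack(bases), full_matrices=False)
    return U[:, :m], svals[:m]
```

A singular value near the number of modalities signals a function that is essentially smooth on all manifolds simultaneously.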
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-03-16 | DOI: 10.1137/20m1331524
K. Khamaru, A. Pananjady, Feng Ruan, M. Wainwright, Michael I. Jordan
Abstract: We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$-error under a generative model. We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms. Theory-inspired simulations show that the widely used temporal difference (TD) algorithm is strictly suboptimal when evaluated in a non-asymptotic setting, even when combined with Polyak-Ruppert iterate averaging. We remedy this issue by introducing and analyzing variance-reduced forms of stochastic approximation, showing that they achieve non-asymptotic, instance-dependent optimality up to logarithmic factors.
Citations: 39
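For concreteness, a tabular sketch of the baseline under discussion: TD(0) policy evaluation under a generative model, with Polyak-Ruppert iterate averaging. The paper's point is that even this averaged version is instance-wise suboptimal without variance reduction:

```python
import numpy as np

def td0_polyak(sample_next, r, gamma, n_states, n_iters, step, rng=None):
    """Tabular TD(0) with Polyak-Ruppert averaging.
    sample_next(s) draws s' ~ P(. | s), as a generative model permits."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.zeros(n_states)           # current value estimates
    theta_bar = np.zeros(n_states)       # running average of iterates
    for t in range(1, n_iters + 1):
        s = int(rng.integers(n_states))  # generative model: query any state
        s_next = sample_next(s)
        # TD(0): move V(s) toward the bootstrapped target r(s) + gamma * V(s')
        theta[s] += step * (r[s] + gamma * theta[s_next] - theta[s])
        theta_bar += (theta - theta_bar) / t
    return theta_bar
```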
Diffusion State Distances: Multitemporal Analysis, Fast Algorithms, and Applications to Biological Networks
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-03-07 | DOI: 10.1137/20M1324089
L. Cowen, K. Devkota, Xiaozhe Hu, James M. Murphy, Kaiyi Wu
Abstract: Data-dependent metrics are powerful tools for learning the underlying structure of high-dimensional data. This article develops and analyzes a data-dependent metric known as diffusion state distance (DSD), which compares points using a data-driven diffusion process. Unlike related diffusion methods, DSDs incorporate information across time scales, which allows the intrinsic data structure to be inferred in a parameter-free manner. This article develops a theory for DSD based on the multitemporal emergence of mesoscopic equilibria in the underlying diffusion process. New algorithms for denoising and dimension reduction with DSD are also proposed and analyzed. These approaches are based on a weighted spectral decomposition of the underlying diffusion process, and experiments on synthetic datasets and real biological networks illustrate the efficacy of the proposed algorithms in terms of both speed and accuracy. Throughout, comparisons with related methods are made, in order to illustrate the distinct advantages of DSD for datasets exhibiting multiscale structure.
Citations: 5
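A truncated-sum sketch of the idea: accumulate expected visit counts over the first $k$ steps of the random walk and compare rows in $\ell_1$. The paper's contribution is the parameter-free multitemporal theory and the fast spectral algorithms that replace this naive computation:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def dsd_truncated(A, k=50):
    """Diffusion state distances from a symmetric adjacency matrix A
    (assumed to have no isolated nodes), truncated at k diffusion steps."""
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)   # row-stochastic random-walk matrix
    H = np.eye(n)                           # expected visit counts, t = 0
    Pt = np.eye(n)
    for _ in range(k):                      # accumulate across time scales
        Pt = Pt @ P
        H += Pt
    # DSD(u, v) = || H[u, :] - H[v, :] ||_1 over all pairs
    return squareform(pdist(H, metric="cityblock"))
```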
Diversity sampling is an implicit regularization for kernel methods
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-02-20 | DOI: 10.1137/20M1320031
M. Fanuel, J. Schreurs, J. Suykens
Abstract: Kernel methods have achieved very good performance on large-scale regression and classification problems by using the Nyström method and preconditioning techniques. The Nyström approximation -- based on a subset of landmarks -- gives a low-rank approximation of the kernel matrix and is known to provide a form of implicit regularization. We further elaborate on the impact of sampling diverse landmarks for constructing the Nyström approximation in supervised as well as unsupervised kernel methods. By using Determinantal Point Processes (DPPs) for sampling, we obtain additional theoretical results concerning the interplay between diversity and regularization. Empirically, we demonstrate the advantages of training kernel methods on subsets made of diverse points. In particular, if the dataset has a dense bulk and a sparser tail, we show that Nyström kernel regression with diverse landmarks increases the accuracy of the regression in sparser regions of the dataset, relative to uniform landmark sampling. A greedy heuristic is also proposed to select diverse samples of significant size within large datasets when exact DPP sampling is not practically feasible.
Citations: 12
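A sketch of the two ingredients: the Nyström approximation from a landmark set, and a greedy pivoted-Cholesky-style selection that favors diverse landmarks when exact DPP sampling is impractical. The greedy routine here is a generic stand-in in the spirit of, not identical to, the paper's heuristic:

```python
import numpy as np

def nystrom(K, landmarks):
    # Low-rank approximation K ~= C W^+ C^T from the landmark columns
    C = K[:, landmarks]
    W = K[np.ix_(landmarks, landmarks)]
    return C @ np.linalg.pinv(W) @ C.T

def greedy_diverse_landmarks(K, m):
    """Greedily grow a subset whose kernel submatrix has large determinant,
    via partial pivoted Cholesky: a cheap proxy for DPP-style diversity."""
    n = K.shape[0]
    G = np.zeros((m, n))
    residual = np.diag(K).astype(float).copy()   # posterior variances
    chosen = []
    for i in range(m):
        j = int(np.argmax(residual))             # most "novel" remaining point
        chosen.append(j)
        G[i] = (K[j] - G[:i].T @ G[:i, j]) / np.sqrt(residual[j])
        residual = np.maximum(residual - G[i] ** 2, 0.0)
        residual[chosen] = 0.0                   # never re-pick a landmark
    return chosen
```

Points far from everything already selected keep a large residual variance, so the selection naturally spreads into the sparser tail of the dataset, which is exactly where the abstract reports accuracy gains.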
Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-02-13 | DOI: 10.1137/21m1394308
Samuel Horváth, Lihua Lei, Peter Richtárik, Michael I. Jordan
Abstract: Adaptivity is an important yet understudied property in modern optimization theory. The gap between the state-of-the-art theory and current practice is striking in that algorithms with desirable theoretical guarantees typically involve drastically different settings of hyperparameters, such as step-size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies provide little, if any, insight to practitioners to select algorithms that work broadly without tweaking the hyperparameters. In this work, blending the "geometrization" technique introduced by Lei and Jordan (2016) and the SARAH algorithm of Nguyen et al. (2017), we propose the Geometrized SARAH algorithm for nonconvex finite-sum and stochastic optimization. Our algorithm is proved to achieve adaptivity to both the magnitude of the target accuracy and the Polyak-Łojasiewicz (PL) constant, if present. In addition, it achieves the best-available convergence rate for non-PL objectives while simultaneously outperforming existing algorithms for PL objectives.
Citations: 18
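The SARAH building block that the Geometrized variant rests on, sketched for a finite-sum objective $\frac{1}{n}\sum_i f_i(w)$; the geometrization itself (randomized epoch lengths adapting to the PL constant) is the paper's contribution and is omitted here:

```python
import numpy as np

def sarah_epoch(grad_i, w, n, step, m, rng=None):
    """One SARAH epoch. grad_i(w, i) returns the gradient of component f_i at w."""
    rng = np.random.default_rng() if rng is None else rng
    # Anchor: full gradient at the epoch start
    v = sum(grad_i(w, i) for i in range(n)) / n
    w_prev, w = w, w - step * v
    for _ in range(m):
        i = int(rng.integers(n))
        # Recursive variance-reduced estimator: correct v with the
        # gradient difference at a single sampled component
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, w - step * v
    return w
```

Unlike SVRG, the estimator is recursive (each correction builds on the previous `v` rather than on the anchor), which is what yields its favorable nonconvex rates.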
Mean-Field Controls with Q-Learning for Cooperative MARL: Convergence and Complexity Analysis
SIAM Journal on Mathematics of Data Science | Pub Date: 2020-02-10 | DOI: 10.1137/20m1360700
Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu
Abstract: Multi-agent reinforcement learning (MARL), despite its popularity and empirical success, suffers from the curse of dimensionality. This paper builds a mathematical framework to approximate cooperative MARL by a mean-field control (MFC) framework, and shows that the approximation error is of order $O(1/\sqrt{N})$. By establishing an appropriate form of the dynamic programming principle for both the value function and the Q-function, it proposes a model-free kernel-based Q-learning algorithm (MFC-K-Q), which is shown to have a linear convergence rate, the first of its kind in the MARL literature. It further establishes that the convergence rate and the sample complexity of MFC-K-Q are independent of the number of agents $N$. Empirical studies of a network traffic congestion problem demonstrate that MFC-K-Q outperforms existing MARL algorithms when $N$ is large, for instance when $N > 50$.
Citations: 37
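A toy sketch of the lifted viewpoint: tabular Q-learning where the "state" is the empirical distribution $\mu$ of the agent population, discretized so it can index a table. The paper's MFC-K-Q additionally uses kernel regression on the probability simplex, which is not shown; everything below is an illustrative assumption:

```python
import numpy as np

def discretize(mu_array, bins=10):
    # Project an empirical distribution onto a grid so it can serve as a dict key
    return tuple(np.round(np.asarray(mu_array) * bins).astype(int))

def mean_field_q_update(Q, mu, a, reward, mu_next, actions, gamma=0.99, lr=0.1):
    """One Q-learning step on the lifted mean-field MDP.
    mu, mu_next: discretized population distributions (tuples from discretize)."""
    best_next = max(Q.get((mu_next, b), 0.0) for b in actions)
    target = reward + gamma * best_next          # Bellman target on the lifted MDP
    Q[(mu, a)] = (1 - lr) * Q.get((mu, a), 0.0) + lr * target
    return Q
```

The table is indexed by distributions rather than joint agent states, which is why the resulting complexity can be independent of the number of agents $N$.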