SIAM Journal on Mathematics of Data Science: Latest Articles

Generalization error of minimum weighted norm and kernel interpolation
SIAM journal on mathematics of data science Pub Date : 2020-08-07 DOI: 10.1137/20M1359912
Weilin Li
{"title":"Generalization error of minimum weighted norm and kernel interpolation","authors":"Weilin Li","doi":"10.1137/20M1359912","DOIUrl":"https://doi.org/10.1137/20M1359912","url":null,"abstract":"We study the generalization error of functions that interpolate prescribed data points and are selected by minimizing a weighted norm. Under natural and general conditions, we prove that both the interpolants and their generalization errors converge as the number of parameters grow, and the limiting interpolant belongs to a reproducing kernel Hilbert space. This rigorously establishes an implicit bias of minimum weighted norm interpolation and explains why norm minimization may benefit from over-parameterization. As special cases of this theory, we study interpolation by trigonometric polynomials and spherical harmonics. Our approach is from a deterministic and approximation theory viewpoint, as opposed a statistical or random matrix one.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"199 1","pages":"414-438"},"PeriodicalIF":0.0,"publicationDate":"2020-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78571762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Normal-bundle Bootstrap
SIAM journal on mathematics of data science Pub Date : 2020-07-27 DOI: 10.1137/20M1356002
Ruda Zhang, R. Ghanem
{"title":"Normal-bundle Bootstrap","authors":"Ruda Zhang, R. Ghanem","doi":"10.1137/20M1356002","DOIUrl":"https://doi.org/10.1137/20M1356002","url":null,"abstract":"Probabilistic models of data sets often exhibit salient geometric structure. Such a phenomenon is summed up in the manifold distribution hypothesis, and can be exploited in probabilistic learning. Here we present normal-bundle bootstrap (NBB), a method that generates new data which preserve the geometric structure of a given data set. Inspired by algorithms for manifold learning and concepts in differential geometry, our method decomposes the underlying probability measure into a marginalized measure on a learned data manifold and conditional measures on the normal spaces. The algorithm estimates the data manifold as a density ridge, and constructs new data by bootstrapping projection vectors and adding them to the ridge. We apply our method to the inference of density ridge and related statistics, and data augmentation to reduce overfitting.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"126 1","pages":"573-592"},"PeriodicalIF":0.0,"publicationDate":"2020-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80009795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Train Like a (Var)Pro: Efficient Training of Neural Networks with Variable Projection
SIAM journal on mathematics of data science Pub Date : 2020-07-26 DOI: 10.1137/20m1359511
Elizabeth Newman, Lars Ruthotto, Joseph L. Hart, B. V. B. Waanders
{"title":"Train Like a (Var)Pro: Efficient Training of Neural Networks with Variable Projection","authors":"Elizabeth Newman, Lars Ruthotto, Joseph L. Hart, B. V. B. Waanders","doi":"10.1137/20m1359511","DOIUrl":"https://doi.org/10.1137/20m1359511","url":null,"abstract":"Deep neural networks (DNNs) have achieved state-of-the-art performance across a variety of traditional machine learning tasks, e.g., speech recognition, image classification, and segmentation. The ability of DNNs to efficiently approximate high-dimensional functions has also motivated their use in scientific applications, e.g., to solve partial differential equations (PDE) and to generate surrogate models. In this paper, we consider the supervised training of DNNs, which arises in many of the above applications. We focus on the central problem of optimizing the weights of the given DNN such that it accurately approximates the relation between observed input and target data. Devising effective solvers for this optimization problem is notoriously challenging due to the large number of weights, non-convexity, data-sparsity, and non-trivial choice of hyperparameters. To solve the optimization problem more efficiently, we propose the use of variable projection (VarPro), a method originally designed for separable nonlinear least-squares problems. Our main contribution is the Gauss-Newton VarPro method (GNvpro) that extends the reach of the VarPro idea to non-quadratic objective functions, most notably, cross-entropy loss functions arising in classification. These extensions make GNvpro applicable to all training problems that involve a DNN whose last layer is an affine mapping, which is common in many state-of-the-art architectures. In numerical experiments from classification and surrogate modeling, GNvpro not only solves the optimization problem more efficiently but also yields DNNs that generalize better than commonly-used optimization schemes.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"10 2 1","pages":"1041-1066"},"PeriodicalIF":0.0,"publicationDate":"2020-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81647654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
EnResNet: ResNets Ensemble via the Feynman-Kac Formalism for Adversarial Defense and Beyond
SIAM journal on mathematics of data science Pub Date : 2020-07-13 DOI: 10.1137/19m1265302
Bao Wang, Binjie Yuan, Zuoqiang Shi, S. Osher
{"title":"EnResNet: ResNets Ensemble via the Feynman-Kac Formalism for Adversarial Defense and Beyond","authors":"Bao Wang, Binjie Yuan, Zuoqiang Shi, S. Osher","doi":"10.1137/19m1265302","DOIUrl":"https://doi.org/10.1137/19m1265302","url":null,"abstract":"Empirical adversarial risk minimization is a widely used mathematical framework to robustly train deep neural nets that are resistant to adversarial attacks. However, both natural and robust accura...","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"96 1","pages":"559-582"},"PeriodicalIF":0.0,"publicationDate":"2020-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77529902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
A Performance Guarantee for Spectral Clustering
SIAM journal on mathematics of data science Pub Date : 2020-07-10 DOI: 10.1137/20M1352193
M. Boedihardjo, Shaofeng Deng, T. Strohmer
{"title":"A Performance Guarantee for Spectral Clustering","authors":"M. Boedihardjo, Shaofeng Deng, T. Strohmer","doi":"10.1137/20M1352193","DOIUrl":"https://doi.org/10.1137/20M1352193","url":null,"abstract":"The two-step spectral clustering method, which consists of the Laplacian eigenmap and a rounding step, is a widely used method for graph partitioning. It can be seen as a natural relaxation to the NP-hard minimum ratio cut problem. In this paper we study the central question: when is spectral clustering able to find the global solution to the minimum ratio cut problem? First we provide a condition that naturally depends on the intra- and inter-cluster connectivities of a given partition under which we may certify that this partition is the solution to the minimum ratio cut problem. Then we develop a deterministic two-to-infinity norm perturbation bound for the the invariant subspace of the graph Laplacian that corresponds to the $k$ smallest eigenvalues. Finally by combining these two results we give a condition under which spectral clustering is guaranteed to output the global solution to the minimum ratio cut problem, which serves as a performance guarantee for spectral clustering.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"2013 1","pages":"369-387"},"PeriodicalIF":0.0,"publicationDate":"2020-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87731639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Semi-supervised Learning for Aggregated Multilayer Graphs Using Diffuse Interface Methods and Fast Matrix-Vector Products
SIAM journal on mathematics of data science Pub Date : 2020-07-10 DOI: 10.1137/20M1352028
Kai Bergermann, M. Stoll, Toni Volkmer
{"title":"Semi-supervised Learning for Aggregated Multilayer Graphs Using Diffuse Interface Methods and Fast Matrix-Vector Products","authors":"Kai Bergermann, M. Stoll, Toni Volkmer","doi":"10.1137/20M1352028","DOIUrl":"https://doi.org/10.1137/20M1352028","url":null,"abstract":"We generalize a graph-based multiclass semi-supervised classification technique based on diffuse interface methods to multilayer graphs. Besides the treatment of various applications with an inherent multilayer structure, we present a very flexible approach that interprets high-dimensional data in a low-dimensional multilayer graph representation. Highly efficient numerical methods involving the spectral decomposition of the corresponding differential graph operators as well as fast matrix-vector products based on the nonequispaced fast Fourier transform (NFFT) enable the rapid treatment of large and high-dimensional data sets. We perform various numerical tests putting a special focus on image segmentation. In particular, we test the performance of our method on data sets with up to 10 million nodes per layer as well as up to 104 dimensions resulting in graphs with up to 52 layers. While all presented numerical experiments can be run on an average laptop computer, the linear dependence per iteration step of the runtime on the network size in all stages of our algorithm makes it scalable to even larger and higher-dimensional problems.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"80 1","pages":"758-785"},"PeriodicalIF":0.0,"publicationDate":"2020-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73724464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Variational Representations and Neural Network Estimation of Rényi Divergences
SIAM journal on mathematics of data science Pub Date : 2020-07-07 DOI: 10.1137/20m1368926
Jeremiah Birrell, P. Dupuis, M. Katsoulakis, L. Rey-Bellet, Jie Wang
{"title":"Variational Representations and Neural Network Estimation of Rényi Divergences","authors":"Jeremiah Birrell, P. Dupuis, M. Katsoulakis, L. Rey-Bellet, Jie Wang","doi":"10.1137/20m1368926","DOIUrl":"https://doi.org/10.1137/20m1368926","url":null,"abstract":"We derive a new variational formula for the R{e}nyi family of divergences, $R_alpha(Q|P)$, between probability measures $Q$ and $P$. Our result generalizes the classical Donsker-Varadhan variational formula for the Kullback-Leibler divergence. We further show that this R{e}nyi variational formula holds over a range of function spaces; this leads to a formula for the optimizer under very weak assumptions and is also key in our development of a consistency theory for R{e}nyi divergence estimators. By applying this theory to neural network estimators, we show that if a neural network family satisfies one of several strengthened versions of the universal approximation property then the corresponding R{e}nyi divergence estimator is consistent. In contrast to likelihood-ratio based methods, our estimators involve only expectations under $Q$ and $P$ and hence are more effective in high dimensional systems. We illustrate this via several numerical examples of neural network estimation in systems of up to 5000 dimensions.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"6 1","pages":"1093-1116"},"PeriodicalIF":0.0,"publicationDate":"2020-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80066195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
The Signature Kernel Is the Solution of a Goursat PDE
SIAM journal on mathematics of data science Pub Date : 2020-06-26 DOI: 10.1137/20M1366794
C. Salvi, Thomas Cass, J. Foster, Terry Lyons, Weixin Yang
{"title":"The Signature Kernel Is the Solution of a Goursat PDE","authors":"C. Salvi, Thomas Cass, J. Foster, Terry Lyons, Weixin Yang","doi":"10.1137/20M1366794","DOIUrl":"https://doi.org/10.1137/20M1366794","url":null,"abstract":"Recently, there has been an increased interest in the development of kernel methods for learning with sequential data. The signature kernel is a learning tool with potential to handle irregularly sampled, multivariate time series. In\"Kernels for sequentially ordered data\"the authors introduced a kernel trick for the truncated version of this kernel avoiding the exponential complexity that would have been involved in a direct computation. Here we show that for continuously differentiable paths, the signature kernel solves a hyperbolic PDE and recognize the connection with a well known class of differential equations known in the literature as Goursat problems. This Goursat PDE only depends on the increments of the input sequences, does not require the explicit computation of signatures and can be solved efficiently using state-of-the-arthyperbolic PDE numerical solvers, giving a kernel trick for the untruncated signature kernel, with the same raw complexity as the method from\"Kernels for sequentially ordered data\", but with the advantage that the PDE numerical scheme is well suited for GPU parallelization, which effectively reduces the complexity by a full order of magnitude in the length of the input sequences. In addition, we extend the previous analysis to the space of geometric rough paths and establish, using classical results from rough path theory, that the rough version of the signature kernel solves a rough integral equation analogous to the aforementioned Goursat PDE. Finally, we empirically demonstrate the effectiveness of our PDE kernel as a machine learning tool in various machine learning applications dealing with sequential data. We release the library sigkernel publicly available at https://github.com/crispitagorico/sigkernel.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"15 1","pages":"873-899"},"PeriodicalIF":0.0,"publicationDate":"2020-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80665343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
Memory-Efficient Structured Convex Optimization via Extreme Point Sampling
SIAM journal on mathematics of data science Pub Date : 2020-06-19 DOI: 10.1137/20m1358037
Nimita Shinde, Vishnu Narayanan, J. Saunderson
{"title":"Memory-Efficient Structured Convex Optimization via Extreme Point Sampling","authors":"Nimita Shinde, Vishnu Narayanan, J. Saunderson","doi":"10.1137/20m1358037","DOIUrl":"https://doi.org/10.1137/20m1358037","url":null,"abstract":"Memory is a key computational bottleneck when solving large-scale convex optimization problems such as semidefinite programs (SDPs). In this paper, we focus on the regime in which storing an $ntimes n$ matrix decision variable is prohibitive. To solve SDPs in this regime, we develop a randomized algorithm that returns a random vector whose covariance matrix is near-feasible and near-optimal for the SDP. We show how to develop such an algorithm by modifying the Frank-Wolfe algorithm to systematically replace the matrix iterates with random vectors. As an application of this approach, we show how to implement the Goemans-Williamson approximation algorithm for textsc{MaxCut} using $mathcal{O}(n)$ memory in addition to the memory required to store the problem instance. We then extend our approach to deal with a broader range of structured convex optimization problems, replacing decision variables with random extreme points of the feasible region.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"51 1","pages":"787-814"},"PeriodicalIF":0.0,"publicationDate":"2020-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90346011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Two Steps at a Time---Taking GAN Training in Stride with Tseng's Method
SIAM journal on mathematics of data science Pub Date : 2020-06-16 DOI: 10.1137/21m1420939
A. Böhm, Michael Sedlmayer, E. R. Csetnek, R. Boț
{"title":"Two Steps at a Time---Taking GAN Training in Stride with Tseng's Method","authors":"A. Böhm, Michael Sedlmayer, E. R. Csetnek, R. Boț","doi":"10.1137/21m1420939","DOIUrl":"https://doi.org/10.1137/21m1420939","url":null,"abstract":"Motivated by the training of Generative Adversarial Networks (GANs), we study methods for solving minimax problems with additional nonsmooth regularizers. We do so by employing emph{monotone operator} theory, in particular the emph{Forward-Backward-Forward (FBF)} method, which avoids the known issue of limit cycling by correcting each update by a second gradient evaluation. Furthermore, we propose a seemingly new scheme which recycles old gradients to mitigate the additional computational cost. In doing so we rediscover a known method, related to emph{Optimistic Gradient Descent Ascent (OGDA)}. For both schemes we prove novel convergence rates for convex-concave minimax problems via a unifying approach. The derived error bounds are in terms of the gap function for the ergodic iterates. For the deterministic and the stochastic problem we show a convergence rate of $mathcal{O}(1/k)$ and $mathcal{O}(1/sqrt{k})$, respectively. We complement our theoretical results with empirical improvements in the training of Wasserstein GANs on the CIFAR10 dataset.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44505523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13