{"title":"损失函数不敏感的稀疏在线回归算法","authors":"Ting Hu , Jing Xiong","doi":"10.1016/j.jmva.2024.105316","DOIUrl":null,"url":null,"abstract":"<div><p>Online learning is an efficient approach in machine learning and statistics, which iteratively updates models upon the observation of a sequence of training examples. A representative online learning algorithm is the online gradient descent, which has found wide applications due to its low complexity and scalability to large datasets. Kernel-based learning methods have been proven to be quite successful in dealing with nonlinearity in the data and multivariate optimization. In this paper we present a class of kernel-based online gradient descent algorithm for addressing regression problems, which generates sparse estimators in an iterative way to reduce the algorithmic complexity for training streaming datasets and model selection in large-scale learning scenarios. In the setting of support vector regression (SVR), we design the sparse online learning algorithm by introducing a sequence of insensitive distance-based loss functions. We prove consistency and error bounds quantifying the generalization performance of such algorithms under mild conditions. The theoretical results demonstrate the interplay between statistical accuracy and sparsity property during learning processes. We show that the insensitive parameter plays a crucial role in providing sparsity as well as fast convergence rates. The numerical experiments also support our theoretical results.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":null,"pages":null},"PeriodicalIF":1.4000,"publicationDate":"2024-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Sparse online regression algorithm with insensitive loss functions\",\"authors\":\"Ting Hu , Jing Xiong\",\"doi\":\"10.1016/j.jmva.2024.105316\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Online learning is an efficient approach in machine learning and statistics, which iteratively updates models upon the observation of a sequence of training examples. A representative online learning algorithm is the online gradient descent, which has found wide applications due to its low complexity and scalability to large datasets. Kernel-based learning methods have been proven to be quite successful in dealing with nonlinearity in the data and multivariate optimization. In this paper we present a class of kernel-based online gradient descent algorithm for addressing regression problems, which generates sparse estimators in an iterative way to reduce the algorithmic complexity for training streaming datasets and model selection in large-scale learning scenarios. In the setting of support vector regression (SVR), we design the sparse online learning algorithm by introducing a sequence of insensitive distance-based loss functions. We prove consistency and error bounds quantifying the generalization performance of such algorithms under mild conditions. The theoretical results demonstrate the interplay between statistical accuracy and sparsity property during learning processes. We show that the insensitive parameter plays a crucial role in providing sparsity as well as fast convergence rates. 
The numerical experiments also support our theoretical results.</p></div>\",\"PeriodicalId\":16431,\"journal\":{\"name\":\"Journal of Multivariate Analysis\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2024-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Multivariate Analysis\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0047259X2400023X\",\"RegionNum\":3,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Multivariate Analysis","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0047259X2400023X","RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
Sparse online regression algorithm with insensitive loss functions
Online learning is an efficient approach in machine learning and statistics that iteratively updates a model as a sequence of training examples is observed. A representative online learning algorithm is online gradient descent, which has found wide application due to its low complexity and its scalability to large datasets. Kernel-based learning methods have proven quite successful at handling nonlinearity in the data and multivariate optimization. In this paper we present a class of kernel-based online gradient descent algorithms for regression problems, which generate sparse estimators iteratively so as to reduce the algorithmic complexity of training on streaming datasets and of model selection in large-scale learning scenarios. In the setting of support vector regression (SVR), we design the sparse online learning algorithm by introducing a sequence of insensitive distance-based loss functions. Under mild conditions, we prove consistency and error bounds quantifying the generalization performance of such algorithms. The theoretical results demonstrate the interplay between statistical accuracy and sparsity during the learning process. We show that the insensitivity parameter plays a crucial role in providing sparsity as well as fast convergence rates. Numerical experiments also support our theoretical results.
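To make the sparsity mechanism concrete, the following is a minimal sketch of a kernel online (sub)gradient descent step with an ε-insensitive absolute loss, in the spirit of the algorithm the abstract describes. The Gaussian kernel, the NORMA-style regularization shrinkage, the step-size schedule, and all names (`sparse_online_svr`, `gaussian_kernel`, `eps`, `lam`, `eta0`) are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    """RBF kernel; any positive-definite kernel could be substituted."""
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

def sparse_online_svr(stream, eps=0.1, lam=0.01, eta0=0.5, sigma=1.0):
    """Kernel online subgradient descent with an eps-insensitive loss.

    stream: iterable of (x, y) pairs (x a 1-D numpy array, y a float).
    Examples whose residual falls inside the insensitive tube get a zero
    subgradient and are never stored, which is the source of sparsity.
    """
    support, alphas = [], []
    for t, (x, y) in enumerate(stream, start=1):
        eta = eta0 / np.sqrt(t)  # decaying step size
        # current prediction f_t(x) = sum_i alpha_i * K(x_i, x)
        fx = sum(a * gaussian_kernel(xi, x, sigma)
                 for xi, a in zip(support, alphas))
        residual = y - fx
        # shrink old coefficients (regularization term lam * ||f||_K^2)
        alphas = [(1 - eta * lam) * a for a in alphas]
        if abs(residual) > eps:
            # outside the tube: nonzero subgradient, add a support point
            support.append(x)
            alphas.append(eta * np.sign(residual))
        # inside the tube: zero subgradient, no new kernel expansion term
    return support, alphas

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = rng.uniform(-1, 1, size=(200, 1))
    ys = np.sin(np.pi * xs[:, 0]) + 0.05 * rng.standard_normal(200)
    sv, al = sparse_online_svr(zip(xs, ys), eps=0.1)
    print(f"{len(sv)} support points out of 200 examples")
```

The point of the sketch is the `if abs(residual) > eps` branch: a larger ε enlarges the tube, so more examples contribute a zero update and the resulting kernel expansion is sparser, at the cost of some bias, mirroring the accuracy/sparsity trade-off discussed above.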
About the journal:
Founded in 1971, the Journal of Multivariate Analysis (JMVA) is the central venue for the publication of new, relevant methodology and particularly innovative applications pertaining to the analysis and interpretation of multidimensional data.
The journal welcomes contributions to all aspects of multivariate data analysis and modeling, including cluster analysis, discriminant analysis, factor analysis, and multidimensional continuous or discrete distribution theory. Topics of current interest include, but are not limited to, inferential aspects of
Copula modeling
Functional data analysis
Graphical modeling
High-dimensional data analysis
Image analysis
Multivariate extreme-value theory
Sparse modeling
Spatial statistics.