{"title":"线性模型中随机梯度下降与放弃正则化的渐近性","authors":"Jiaqi Li, Johannes Schmidt-Hieber, Wei Biao Wu","doi":"arxiv-2409.07434","DOIUrl":null,"url":null,"abstract":"This paper proposes an asymptotic theory for online inference of the\nstochastic gradient descent (SGD) iterates with dropout regularization in\nlinear regression. Specifically, we establish the geometric-moment contraction\n(GMC) for constant step-size SGD dropout iterates to show the existence of a\nunique stationary distribution of the dropout recursive function. By the GMC\nproperty, we provide quenched central limit theorems (CLT) for the difference\nbetween dropout and $\\ell^2$-regularized iterates, regardless of\ninitialization. The CLT for the difference between the Ruppert-Polyak averaged\nSGD (ASGD) with dropout and $\\ell^2$-regularized iterates is also presented.\nBased on these asymptotic normality results, we further introduce an online\nestimator for the long-run covariance matrix of ASGD dropout to facilitate\ninference in a recursive manner with efficiency in computational time and\nmemory. The numerical experiments demonstrate that for sufficiently large\nsamples, the proposed confidence intervals for ASGD with dropout nearly achieve\nthe nominal coverage probability.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"48 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Asymptotics of Stochastic Gradient Descent with Dropout Regularization in Linear Models\",\"authors\":\"Jiaqi Li, Johannes Schmidt-Hieber, Wei Biao Wu\",\"doi\":\"arxiv-2409.07434\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper proposes an asymptotic theory for online inference of the\\nstochastic gradient descent (SGD) iterates with dropout regularization in\\nlinear regression. Specifically, we establish the geometric-moment contraction\\n(GMC) for constant step-size SGD dropout iterates to show the existence of a\\nunique stationary distribution of the dropout recursive function. By the GMC\\nproperty, we provide quenched central limit theorems (CLT) for the difference\\nbetween dropout and $\\\\ell^2$-regularized iterates, regardless of\\ninitialization. The CLT for the difference between the Ruppert-Polyak averaged\\nSGD (ASGD) with dropout and $\\\\ell^2$-regularized iterates is also presented.\\nBased on these asymptotic normality results, we further introduce an online\\nestimator for the long-run covariance matrix of ASGD dropout to facilitate\\ninference in a recursive manner with efficiency in computational time and\\nmemory. 
The numerical experiments demonstrate that for sufficiently large\\nsamples, the proposed confidence intervals for ASGD with dropout nearly achieve\\nthe nominal coverage probability.\",\"PeriodicalId\":501379,\"journal\":{\"name\":\"arXiv - STAT - Statistics Theory\",\"volume\":\"48 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Statistics Theory\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07434\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Statistics Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07434","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Asymptotics of Stochastic Gradient Descent with Dropout Regularization in Linear Models
This paper proposes an asymptotic theory for online inference of the
stochastic gradient descent (SGD) iterates with dropout regularization in
linear regression. Specifically, we establish the geometric-moment contraction
(GMC) for constant step-size SGD dropout iterates to show the existence of a
unique stationary distribution of the dropout recursive function. By the GMC
property, we provide quenched central limit theorems (CLT) for the difference
between dropout and $\ell^2$-regularized iterates, regardless of
initialization. The CLT for the difference between the Ruppert-Polyak averaged
SGD (ASGD) with dropout and $\ell^2$-regularized iterates is also presented.
Based on these asymptotic normality results, we further introduce an online
estimator of the long-run covariance matrix of the ASGD dropout iterates, which
enables recursive inference that is efficient in both computation time and
memory. Numerical experiments demonstrate that, for sufficiently large samples,
the proposed confidence intervals for ASGD with dropout nearly attain the
nominal coverage probability.
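
To make the setting concrete, the sketch below implements constant step-size SGD for linear regression with a Bernoulli dropout mask applied to the covariates at every step, together with Ruppert-Polyak (ASGD) averaging. It is a minimal illustration under assumed conventions (the function name, the placement of the mask, and the toy data are not taken from the paper).

```python
import numpy as np

def dropout_sgd_linear(X, y, lr=0.01, p=0.9, seed=0):
    """Constant step-size SGD for linear regression with dropout on the
    covariates, plus Ruppert-Polyak (ASGD) averaging.

    Illustrative sketch only; the exact recursion and notation are assumptions,
    not the paper's. Returns the path of SGD iterates and the ASGD average.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)                      # current SGD iterate
    beta_bar = np.zeros(d)                  # running Ruppert-Polyak average
    path = np.empty((n, d))                 # store iterates for later inference
    for t in range(n):
        mask = rng.binomial(1, p, size=d)   # Bernoulli(p) dropout mask D_t
        x_t = mask * X[t]                   # dropped-out covariates
        grad = -(y[t] - x_t @ beta) * x_t   # stochastic gradient of squared loss
        beta = beta - lr * grad             # constant step-size update
        beta_bar += (beta - beta_bar) / (t + 1)  # recursive averaging
        path[t] = beta
    return path, beta_bar

# Toy data from a linear model.
rng = np.random.default_rng(1)
n, d = 50_000, 5
X = rng.normal(size=(n, d))
beta_true = np.array([1.0, -0.5, 0.0, 2.0, 0.3])
y = X @ beta_true + rng.normal(size=n)

path, beta_asgd = dropout_sgd_linear(X, y, lr=0.01, p=0.9)
print("ASGD average:", np.round(beta_asgd, 3))
```

Because dropout acts as an $\ell^2$-type shrinkage, the averaged iterate settles near a ridge-like target rather than the unregularized coefficients; the paper's CLTs describe fluctuations around such limits.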
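
The paper's online long-run covariance estimator makes confidence intervals for the ASGD dropout iterates computable recursively. As a simpler offline stand-in, the following sketch (continuing from the code above and reusing `path`) estimates the long-run covariance with nonoverlapping batch means and forms normal-approximation intervals; the batch length, burn-in, and the batch-means formula are generic choices and not the paper's estimator.

```python
import numpy as np

def batch_means_lrv(path, n_batches=50):
    """Nonoverlapping batch-means estimate of the long-run covariance matrix
    of a roughly stationary sequence of iterates (offline stand-in for the
    paper's recursive online estimator; requires storing the whole path)."""
    n, d = path.shape
    b = n // n_batches                                       # batch length
    trimmed = path[: b * n_batches]
    means = trimmed.reshape(n_batches, b, d).mean(axis=1)    # batch means
    centered = means - means.mean(axis=0)
    return b * (centered.T @ centered) / (n_batches - 1)

# 95% confidence intervals for each coordinate of the averaged iterate,
# using the normal approximation sqrt(n) (beta_bar - target) ~ N(0, Sigma).
burn = 1_000                                  # discard transient iterates
post = path[burn:]
Sigma_hat = batch_means_lrv(post)
beta_bar = post.mean(axis=0)
half_width = 1.96 * np.sqrt(np.diag(Sigma_hat) / len(post))
for j in range(len(beta_bar)):
    print(f"coord {j}: {beta_bar[j]:.3f} ± {half_width[j]:.3f}")
```

In this toy run the intervals are centered at the dropout-regularized target, mirroring the abstract's finding that, with enough samples, such intervals come close to nominal coverage.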