Asymptotics of Stochastic Gradient Descent with Dropout Regularization in Linear Models

Jiaqi Li, Johannes Schmidt-Hieber, Wei Biao Wu
{"title":"线性模型中随机梯度下降与放弃正则化的渐近性","authors":"Jiaqi Li, Johannes Schmidt-Hieber, Wei Biao Wu","doi":"arxiv-2409.07434","DOIUrl":null,"url":null,"abstract":"This paper proposes an asymptotic theory for online inference of the\nstochastic gradient descent (SGD) iterates with dropout regularization in\nlinear regression. Specifically, we establish the geometric-moment contraction\n(GMC) for constant step-size SGD dropout iterates to show the existence of a\nunique stationary distribution of the dropout recursive function. By the GMC\nproperty, we provide quenched central limit theorems (CLT) for the difference\nbetween dropout and $\\ell^2$-regularized iterates, regardless of\ninitialization. The CLT for the difference between the Ruppert-Polyak averaged\nSGD (ASGD) with dropout and $\\ell^2$-regularized iterates is also presented.\nBased on these asymptotic normality results, we further introduce an online\nestimator for the long-run covariance matrix of ASGD dropout to facilitate\ninference in a recursive manner with efficiency in computational time and\nmemory. The numerical experiments demonstrate that for sufficiently large\nsamples, the proposed confidence intervals for ASGD with dropout nearly achieve\nthe nominal coverage probability.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"48 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Asymptotics of Stochastic Gradient Descent with Dropout Regularization in Linear Models\",\"authors\":\"Jiaqi Li, Johannes Schmidt-Hieber, Wei Biao Wu\",\"doi\":\"arxiv-2409.07434\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper proposes an asymptotic theory for online inference of the\\nstochastic gradient descent (SGD) iterates with dropout regularization in\\nlinear regression. Specifically, we establish the geometric-moment contraction\\n(GMC) for constant step-size SGD dropout iterates to show the existence of a\\nunique stationary distribution of the dropout recursive function. By the GMC\\nproperty, we provide quenched central limit theorems (CLT) for the difference\\nbetween dropout and $\\\\ell^2$-regularized iterates, regardless of\\ninitialization. The CLT for the difference between the Ruppert-Polyak averaged\\nSGD (ASGD) with dropout and $\\\\ell^2$-regularized iterates is also presented.\\nBased on these asymptotic normality results, we further introduce an online\\nestimator for the long-run covariance matrix of ASGD dropout to facilitate\\ninference in a recursive manner with efficiency in computational time and\\nmemory. 
The numerical experiments demonstrate that for sufficiently large\\nsamples, the proposed confidence intervals for ASGD with dropout nearly achieve\\nthe nominal coverage probability.\",\"PeriodicalId\":501379,\"journal\":{\"name\":\"arXiv - STAT - Statistics Theory\",\"volume\":\"48 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Statistics Theory\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07434\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Statistics Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07434","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

This paper proposes an asymptotic theory for online inference of the stochastic gradient descent (SGD) iterates with dropout regularization in linear regression. Specifically, we establish the geometric-moment contraction (GMC) for constant step-size SGD dropout iterates to show the existence of a unique stationary distribution of the dropout recursive function. By the GMC property, we provide quenched central limit theorems (CLT) for the difference between dropout and $\ell^2$-regularized iterates, regardless of initialization. The CLT for the difference between the Ruppert-Polyak averaged SGD (ASGD) with dropout and $\ell^2$-regularized iterates is also presented. Based on these asymptotic normality results, we further introduce an online estimator for the long-run covariance matrix of ASGD dropout to facilitate inference in a recursive manner with efficiency in computational time and memory. The numerical experiments demonstrate that for sufficiently large samples, the proposed confidence intervals for ASGD with dropout nearly achieve the nominal coverage probability.
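For intuition, the sketch below shows constant-step-size SGD with dropout in a linear regression model, combined with Ruppert-Polyak (ASGD) averaging of the iterates. It is a minimal illustration under assumed conventions (retention probability `p`, an unscaled Bernoulli mask, simulated Gaussian data, and an arbitrary step size); it is not the paper's exact recursion, and it omits the paper's online long-run covariance estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated linear model: y_k = x_k^T theta_star + noise
d, n = 5, 50_000
theta_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ theta_star + rng.normal(scale=0.5, size=n)

p = 0.8        # dropout retention probability (assumed)
alpha = 0.01   # constant step size (assumed)

theta = np.zeros(d)       # dropout-SGD iterate
theta_bar = np.zeros(d)   # Ruppert-Polyak (ASGD) average

for k in range(n):
    x_k, y_k = X[k], y[k]
    # Diagonal dropout mask: each coordinate is kept with probability p.
    mask = rng.binomial(1, p, size=d)
    # One constant-step-size SGD step on the dropout-perturbed squared loss
    # 0.5 * (y_k - (mask * x_k)^T theta)^2.
    grad = mask * x_k * (np.dot(mask * x_k, theta) - y_k)
    theta = theta - alpha * grad
    # Running average of the iterates (Ruppert-Polyak averaging).
    theta_bar += (theta - theta_bar) / (k + 1)

print("ASGD dropout estimate:", theta_bar)
print("True parameter:       ", theta_star)
```

Because the mask here is not rescaled by 1/p, the averaged iterate targets a shrunken, ℓ²-regularized-type version of the least-squares solution rather than the true parameter itself; the quenched CLTs described in the abstract quantify the fluctuations of the gap between such dropout iterates and their ℓ²-regularized counterparts.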