Minibatch and local SGD: Algorithmic stability and linear speedup in generalization

IF 3.2; Zone 2 (Mathematics); Q1 (MATHEMATICS, APPLIED)

Applied and Computational Harmonic Analysis. Pub Date: 2025-07-16. DOI: 10.1016/j.acha.2025.101795

Yunwen Lei, Tao Sun, Mingrui Liu

Citations: 0

Abstract

The increasing scale of data has popularized the use of parallelism to speed up optimization. Minibatch stochastic gradient descent (minibatch SGD) and local SGD are two popular methods for parallel optimization. Existing theoretical studies show that these methods achieve a linear speedup with respect to the number of machines; this speedup, however, is measured by optimization errors in a multi-pass setting. By comparison, the stability and generalization of these methods are much less studied. In this paper, we study the stability and generalization of minibatch and local SGD to understand their learnability, introducing an expectation-variance decomposition. We incorporate training errors into the stability analysis, which shows how small training errors help generalization for overparameterized models. We show that minibatch and local SGD achieve a linear speedup in attaining the optimal risk bounds.
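
The abstract names the two methods without spelling out their updates. The following is a minimal sketch of the textbook forms of minibatch SGD and local SGD on a one-dimensional least-squares problem. It is an illustration only and may differ from the exact algorithms analyzed in the paper. All function names, the toy data, and the step sizes are assumptions made for this example.

```python
import random

def grad(w, x, y):
    # Gradient of the squared loss 0.5 * (w * x - y) ** 2 with respect to w.
    return (w * x - y) * x

def minibatch_sgd(data, machines, batch, steps, lr, rng):
    # Each step: every machine contributes `batch` sampled examples, and the
    # single shared iterate moves along the average of all sampled gradients.
    w = 0.0
    for _ in range(steps):
        sample = [rng.choice(data) for _ in range(machines * batch)]
        g = sum(grad(w, x, y) for x, y in sample) / len(sample)
        w -= lr * g
    return w

def local_sgd(data, machines, local_steps, rounds, lr, rng):
    # Each round: every machine starts from the shared iterate, runs
    # `local_steps` single-example SGD steps independently, and then the
    # machines' iterates are averaged into the new shared iterate.
    w = 0.0
    for _ in range(rounds):
        local_iterates = []
        for _ in range(machines):
            v = w
            for _ in range(local_steps):
                x, y = rng.choice(data)
                v -= lr * grad(v, x, y)
            local_iterates.append(v)
        w = sum(local_iterates) / machines
    return w

# Toy data (assumed for illustration): y = 2 * x + small Gaussian noise.
rng = random.Random(0)
true_w = 2.0
data = []
for _ in range(200):
    x = rng.uniform(-1.0, 1.0)
    data.append((x, true_w * x + rng.gauss(0.0, 0.1)))

w_mb = minibatch_sgd(data, machines=4, batch=8, steps=300, lr=0.5, rng=rng)
w_loc = local_sgd(data, machines=4, local_steps=8, rounds=300, lr=0.1, rng=rng)
print(w_mb, w_loc)
```

In this sketch the two methods differ only in when communication happens: minibatch SGD averages gradients at every step, while local SGD averages iterates once per round of local steps.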

Source journal

Applied and Computational Harmonic Analysis (Physics - Physics: Mathematical Physics)

CiteScore

5.40

Self-citation rate

4.00%

Articles published

Review time

22.9 weeks

Journal description: Applied and Computational Harmonic Analysis (ACHA) is an interdisciplinary journal that publishes high-quality papers in all areas of mathematical sciences related to the applied and computational aspects of harmonic analysis, with special emphasis on innovative theoretical development, methods, and algorithms, for information processing, manipulation, understanding, and so forth. The objectives of the journal are to chronicle the important publications in the rapidly growing field of data representation and analysis, to stimulate research in relevant interdisciplinary areas, and to provide a common link among mathematical, physical, and life scientists, as well as engineers.