Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data

2022 56th Annual Conference on Information Sciences and Systems (CISS). Pub Date: 2022-03-09. DOI: 10.1109/CISS53076.2022.9751184

Hongkang Li, Shuai Zhang, M. Wang

Citations: 3

Abstract

This paper analyzes the convergence and generalization of training a one-hidden-layer neural network when the input features follow a Gaussian mixture model consisting of a finite number of Gaussian distributions. Assuming the labels are generated by a teacher model with an unknown ground-truth weight, the learning problem is to estimate the underlying teacher model by minimizing a non-convex risk function over a student neural network. Given a finite number of training samples, referred to as the sample complexity, the iterations are proved to converge linearly to a critical point with guaranteed generalization error. In addition, for the first time, this paper characterizes the impact of the input distribution on the sample complexity and the learning rate.
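The teacher-student setup described in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's algorithm: the sigmoid activation, the averaging output layer, the squared-loss risk, the random initialization, the noiseless real-valued labels, and every numeric value (dimension, number of neurons, mixture means, step size) are assumptions made here for illustration, as are all function names.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_gmm(n, means, weights, sigma, rng):
    """Draw n points from a mixture of isotropic Gaussians N(mean_k, sigma^2 I)."""
    xs = []
    for _ in range(n):
        mean = rng.choices(means, weights=weights, k=1)[0]
        xs.append([m + sigma * rng.gauss(0.0, 1.0) for m in mean])
    return xs


def network(W, x):
    """One-hidden-layer net (assumed form): mean of sigmoid(w_j . x) over neurons."""
    return sum(sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for w in W) / len(W)


def empirical_risk(W, xs, ys):
    """Squared-loss empirical risk (an assumption; non-convex in W)."""
    return sum((network(W, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


def gradient(W, xs, ys):
    """Exact gradient of empirical_risk with respect to the hidden-layer weights."""
    K, d, n = len(W), len(W[0]), len(xs)
    grad = [[0.0] * d for _ in range(K)]
    for x, y in zip(xs, ys):
        acts = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for w in W]
        residual = sum(acts) / K - y
        for j in range(K):
            # Chain rule: sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)).
            coeff = residual * acts[j] * (1.0 - acts[j]) / (K * n)
            for i in range(d):
                grad[j][i] += coeff * x[i]
    return grad


def train(W, xs, ys, lr=0.5, steps=200):
    """Full-batch gradient descent; returns final weights and the risk per step."""
    history = [empirical_risk(W, xs, ys)]
    for _ in range(steps):
        g = gradient(W, xs, ys)
        W = [[w - lr * gw for w, gw in zip(wr, gr)] for wr, gr in zip(W, g)]
        history.append(empirical_risk(W, xs, ys))
    return W, history


rng = random.Random(0)
means = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]       # hypothetical mixture means
xs = sample_gmm(200, means, [0.5, 0.5], 1.0, rng)   # Gaussian-mixture inputs
W_teacher = [[1.0, -0.5, 0.3], [-0.7, 0.8, 0.2]]    # hypothetical ground truth
ys = [network(W_teacher, x) for x in xs]            # labels from the teacher
W_init = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(2)]
W_student, history = train(W_init, xs, ys)
print(f"empirical risk: {history[0]:.4f} -> {history[-1]:.6f}")
```

Changing `means`, the mixture weights, or `sigma` in this toy setup is one way to see informally how the input distribution affects how quickly the risk falls for a given step size and sample count, which is the dependence the paper characterizes rigorously.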

