Deep Neural Networks: Multi-Classification and Universal Approximation

Martín Hernández, Enrique Zuazua
{"title":"Deep Neural Networks: Multi-Classification and Universal Approximation","authors":"Martín Hernández, Enrique Zuazua","doi":"arxiv-2409.06555","DOIUrl":null,"url":null,"abstract":"We demonstrate that a ReLU deep neural network with a width of $2$ and a\ndepth of $2N+4M-1$ layers can achieve finite sample memorization for any\ndataset comprising $N$ elements in $\\mathbb{R}^d$, where $d\\ge1,$ and $M$\nclasses, thereby ensuring accurate classification. By modeling the neural network as a time-discrete nonlinear dynamical system,\nwe interpret the memorization property as a problem of simultaneous or ensemble\ncontrollability. This problem is addressed by constructing the network\nparameters inductively and explicitly, bypassing the need for training or\nsolving any optimization problem. Additionally, we establish that such a network can achieve universal\napproximation in $L^p(\\Omega;\\mathbb{R}_+)$, where $\\Omega$ is a bounded subset\nof $\\mathbb{R}^d$ and $p\\in[1,\\infty)$, using a ReLU deep neural network with a\nwidth of $d+1$. We also provide depth estimates for approximating $W^{1,p}$\nfunctions and width estimates for approximating $L^p(\\Omega;\\mathbb{R}^m)$ for\n$m\\geq1$. Our proofs are constructive, offering explicit values for the biases\nand weights involved.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.06555","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements in $\mathbb{R}^d$, where $d\ge1$, and $M$ classes, thereby ensuring accurate classification. By modeling the neural network as a time-discrete nonlinear dynamical system, we interpret the memorization property as a problem of simultaneous or ensemble controllability. This problem is addressed by constructing the network parameters inductively and explicitly, bypassing the need for training or solving any optimization problem. Additionally, we establish that such a network can achieve universal approximation in $L^p(\Omega;\mathbb{R}_+)$, where $\Omega$ is a bounded subset of $\mathbb{R}^d$ and $p\in[1,\infty)$, using a ReLU deep neural network with a width of $d+1$. We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m\geq1$. Our proofs are constructive, offering explicit values for the biases and weights involved.
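To make the size claim concrete, here is a minimal PyTorch sketch of a ReLU network with the stated dimensions: hidden width $2$ and $2N+4M-1$ layers. The initial projection from $\mathbb{R}^d$ down to width $2$ and the scalar readout are assumptions made for illustration, and the parameters are left at random initialization; the paper's explicit, constructive weights and biases are not reproduced here.

```python
import torch
import torch.nn as nn

def narrow_relu_net(d: int, N: int, M: int) -> nn.Sequential:
    """ReLU network of hidden width 2 with 2N + 4M - 1 layers, the size
    the paper proves sufficient to memorize N labelled points in R^d
    across M classes. Weights are randomly initialized: this shows only
    the architecture's shape, not the paper's memorizing parameters."""
    depth = 2 * N + 4 * M - 1               # layer count from the paper
    layers = [nn.Linear(d, 2), nn.ReLU()]   # assumed projection of R^d into width 2
    for _ in range(depth - 1):
        layers += [nn.Linear(2, 2), nn.ReLU()]
    layers.append(nn.Linear(2, 1))          # assumed scalar readout encoding the class
    return nn.Sequential(*layers)

# N = 10 points in R^3 with M = 2 classes -> 2*10 + 4*2 - 1 = 27 width-2 layers
net = narrow_relu_net(d=3, N=10, M=2)
x = torch.randn(5, 3)                       # batch of 5 points in R^3
print(net(x).shape)                         # torch.Size([5, 1])
```

Note the trade-off the depth formula expresses: the network is as narrow as possible, and in exchange its depth grows linearly in both the number of samples $N$ and the number of classes $M$.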