Stable tensor neural networks for efficient deep learning.

Impact factor: 2.4 (JCR Q3, Computer Science, Information Systems)

Frontiers in Big Data. Published: 2024-05-30. eCollection date: 2024-01-01. DOI: 10.3389/fdata.2024.1363978

Elizabeth Newman, Lior Horesh, Haim Avron, Misha E Kilmer

Frontiers in Big Data, volume 7, article 1363978. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170703/pdf/


Abstract

Learning from complex, multidimensional data has become central to computational mathematics, and among the most successful high-dimensional function approximators are deep neural networks (DNNs). Training DNNs is posed as an optimization problem to learn network weights or parameters that well-approximate a mapping from input to target data. Multiway data or tensors arise naturally in myriad ways in deep learning, in particular as input data and as high-dimensional weights and features extracted by the network, with the latter often being a bottleneck in terms of speed and memory. In this work, we leverage tensor representations and processing to efficiently parameterize DNNs when learning from high-dimensional data. We propose tensor neural networks (t-NNs), a natural extension of traditional fully-connected networks, that can be trained efficiently in a reduced, yet more powerful parameter space. Our t-NNs are built upon matrix-mimetic tensor-tensor products, which retain algebraic properties of matrix multiplication while capturing high-dimensional correlations. Mimeticity enables t-NNs to inherit desirable properties of modern DNN architectures. We exemplify this by extending recent work on stable neural networks, which interpret DNNs as discretizations of differential equations, to our multidimensional framework. We provide empirical evidence of the parametric advantages of t-NNs on dimensionality reduction using autoencoders and classification using fully-connected and stable variants on benchmark imaging datasets MNIST and CIFAR-10.
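To make the phrase "matrix-mimetic tensor-tensor products" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the well-known t-product (transform along the third dimension with the FFT, multiply frontal slices, transform back) as one example of such a product; the abstract does not say which transform the paper uses. The function names, the tanh activation, the step size h, and the forward Euler update in stable_t_layer are all illustrative assumptions, chosen only to show how a layer can be read as one step of a discretized differential equation.

```python
import numpy as np


def t_product(A, B):
    """t-product of A (m x p x n) with B (p x q x n), giving an (m x q x n) tensor.

    Assumption: the FFT-based t-product stands in for the matrix-mimetic
    product named in the abstract.
    """
    if A.shape[1] != B.shape[0] or A.shape[2] != B.shape[2]:
        raise ValueError("incompatible tensor shapes for the t-product")
    # Move to the transform domain along the third dimension.
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Multiply matching frontal slices as ordinary matrices.
    C_hat = np.einsum("ipk,pjk->ijk", A_hat, B_hat)
    # Return to the original domain; the result is real for real inputs.
    return np.real(np.fft.ifft(C_hat, axis=2))


def t_identity(m, n):
    """Identity tensor: first frontal slice is the identity matrix, the rest are zero."""
    I = np.zeros((m, m, n))
    I[:, :, 0] = np.eye(m)
    return I


def t_transpose(A):
    """Transpose every frontal slice and reverse the order of slices 2..n."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)


def stable_t_layer(Y, W, b, h=0.1):
    """One forward Euler step Y + h * tanh(W * Y + b), with * the t-product.

    Hypothetical layer form: Y is (m x batch x n), W is (m x m x n) and
    b is (m x 1 x n), broadcast over the batch dimension.
    """
    return Y + h * np.tanh(t_product(W, Y) + b)


rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5))
B = rng.standard_normal((4, 2, 5))
C = t_product(A, B)

# Matrix-like identities that carry over to the tensor setting.
print(np.allclose(t_product(t_identity(3, 5), A), A))
print(np.allclose(t_transpose(C), t_product(t_transpose(B), t_transpose(A))))
```

The two checks mirror the matrix identities I A = A and (A B)^T = B^T A^T. On parameter count, under the same assumptions: a dense layer acting on vectorized m x n features needs (mn)^2 weights, whereas the tensor weight W above stores m^2 n entries.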
