Convergent autoencoder approximation of low bending and low distortion manifold embeddings

ESAIM: Mathematical Modelling and Numerical Analysis. Pub Date: 2023-11-16. DOI: 10.1051/m2an/2023088

Juliane Braunsmann, Marko Rajković, Benedikt Wirth, Martin Rumpf


Citation count: 0

Abstract

Autoencoders are widely used in machine learning for dimension reduction of high-dimensional data. The encoder embeds the input data manifold into a lower-dimensional latent space, while the decoder represents the inverse map, providing a parametrization of the data manifold by the manifold in latent space. We propose and analyze a novel regularization for learning the encoder component of an autoencoder: a loss functional that prefers isometric, extrinsically flat embeddings and allows the encoder to be trained on its own. To perform the training, it is assumed that the local Riemannian distance and the local Riemannian average can be evaluated for pairs of nearby points on the input manifold. The loss functional is computed via Monte Carlo integration. Our main theorem identifies a geometric loss functional of the embedding map as the $\Gamma$-limit of the sampling-dependent loss functionals. Numerical tests, using image data that encodes different explicitly given data manifolds, show that smooth manifold embeddings into latent space are obtained. Because extrinsic flatness is promoted, interpolation between points on the manifold that are not too far apart is well approximated by linear interpolation in latent space.
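To make the ingredients named in the abstract concrete, the following is a minimal sketch of a sampling-based loss built from pairs of nearby points, their local Riemannian distance, and their local Riemannian average. It is not the paper's functional: the abstract does not state the exact penalty terms, so the squared deviation of the stretch from 1 (distortion), the squared second difference through the midpoint (bending), the weight `lam`, the sampling radius `eps`, and all function names are assumptions chosen for illustration. The data manifold is taken to be a quarter arc of the unit circle in the plane, where distance and average are known in closed form, and the latent space is one-dimensional.

```python
import numpy as np

def point(t):
    """Map angles t to points on the unit circle in R^2."""
    return np.stack([np.cos(t), np.sin(t)], axis=-1)

def mc_loss(encoder, n=1000, eps=0.1, lam=1.0, seed=0):
    """Monte Carlo estimate of an illustrative distortion + bending loss.

    encoder: maps an (n, 2) array of points to an (n,) array of latent
             coordinates (1D latent space, an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    # Sample pairs of nearby points on the arc [0, pi/2]; the offset is
    # bounded away from zero so the difference quotients stay stable.
    t0 = rng.uniform(0.0, np.pi / 2 - eps, size=n)
    t1 = t0 + rng.uniform(0.1 * eps, eps, size=n)
    x, y = point(t0), point(t1)
    d = t1 - t0                      # geodesic distance on the unit circle
    mid = point(0.5 * (t0 + t1))     # Riemannian average (geodesic midpoint)

    fx, fy, fm = encoder(x), encoder(y), encoder(mid)
    # Distortion: an isometric embedding has stretch |f(y) - f(x)| / d = 1.
    stretch = np.abs(fy - fx) / d
    distortion = (stretch - 1.0) ** 2
    # Bending: a flat embedding maps the midpoint to the latent midpoint.
    second_diff = (fm - 0.5 * (fx + fy)) / d ** 2
    bending = second_diff ** 2
    return float(np.mean(distortion + lam * bending))

def angle_encoder(p):
    """Arc-length parametrization: isometric and flat on this arc."""
    return np.arctan2(p[:, 1], p[:, 0])

def projection_encoder(p):
    """Projection onto the first axis: distorts lengths along the arc."""
    return p[:, 0]

if __name__ == "__main__":
    print("arc-length encoder loss:", mc_loss(angle_encoder))
    print("projection encoder loss:", mc_loss(projection_encoder))
```

In this toy setting the arc-length encoder yields a loss at the level of floating-point error, while the projection is penalized because it compresses lengths near one end of the arc. In practice the encoder would be a trainable network and the sampled loss would be minimized over its parameters.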




Source journal

ESAIM: Mathematical Modelling and Numerical Analysis

CiteScore

3.00

Self-citation rate

0.00%

Number of publications