Convergent autoencoder approximation of low bending and low distortion manifold embeddings
Juliane Braunsmann, Marko Rajković, Benedikt Wirth, Martin Rumpf
ESAIM: Mathematical Modelling and Numerical Analysis, published 2023-11-16
DOI: 10.1051/m2an/2023088
Citations: 0
Abstract
Autoencoders are widely used in machine learning for dimension reduction of high-dimensional data. The encoder embeds the input data manifold into a lower-dimensional latent space, while the decoder represents the inverse map, providing a parametrization of the data manifold via its image in latent space. We propose and analyze a novel regularization for learning the encoder component of an autoencoder: a loss functional that prefers isometric, extrinsically flat embeddings and allows the encoder to be trained on its own. To perform the training, it is assumed that the local Riemannian distance and the local Riemannian average can be evaluated for pairs of nearby points on the input manifold. The loss functional is computed via Monte Carlo integration. Our main theorem identifies a geometric loss functional of the embedding map as the $\Gamma$-limit of the sampling-dependent loss functionals. Numerical tests, using image data that encodes different explicitly given data manifolds, show that smooth manifold embeddings into latent space are obtained. Due to the promotion of extrinsic flatness, interpolation between nearby points on the manifold is well approximated by linear interpolation in latent space.
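To make the sampling-based construction concrete: the paper's full loss also involves Riemannian averages and a bending (extrinsic flatness) term, but the isometry-promoting part can be sketched as a Monte Carlo sum over nearby sample pairs, comparing latent distances against the given local Riemannian distances. The sketch below is a toy illustration under simplified assumptions, not the paper's implementation; the function name `sampled_isometry_loss`, the circle test manifold, and the choice of locality radius `eps` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sampled_isometry_loss(points, dists, encode, eps=0.3):
    """Monte Carlo estimate of a local-isometry loss: penalize the
    mismatch between latent distances and the given local Riemannian
    distances, summed over sampled pairs of nearby points."""
    z = encode(points)
    loss, count = 0.0, 0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            if dists[i, j] < eps:  # only nearby pairs enter the loss
                loss += (np.linalg.norm(z[i] - z[j]) - dists[i, j]) ** 2
                count += 1
    return loss / max(count, 1)

# Toy data manifold: a circle in R^3; Riemannian distance = arc length.
t = rng.uniform(0.0, 2.0 * np.pi, size=200)
points = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
dt = np.abs(t[:, None] - t[None, :])
dists = np.minimum(dt, 2.0 * np.pi - dt)  # geodesic distance on the circle

# Projecting onto the circle's plane is nearly isometric for close points
# (chord length ~ arc length), so its loss should be small, ...
loss_proj = sampled_isometry_loss(points, dists, lambda x: x[:, :2])
# ... while a collapsed encoder mapping everything to 0 scores far worse.
loss_zero = sampled_isometry_loss(points, dists, lambda x: 0.0 * x[:, :2])
print(loss_proj, loss_zero)
```

In practice the encoder would be a neural network trained by gradient descent on such a loss; the lambda encoders above merely stand in for a near-isometric and a degenerate map to show the loss discriminates between them.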