MANGO: Disentangled Image Transformation Manifolds with Grouped Operators
Brighton Ancelin, Yenho Chen, Peimeng Guan, Chiraag Kaushik, Belen Martin-Urcelay, Alex Saad-Falcon, Nakul Singh
arXiv:2409.09542 · arXiv - EE - Image and Video Processing · 2024-09-14
Learning semantically meaningful image transformations (e.g. rotation,
thickness, blur) directly from examples can be a challenging task. Recently,
the Manifold Autoencoder (MAE) was proposed, which uses a set of Lie group
operators to learn image transformations directly from examples. However,
this approach has limitations: the learned operators are not guaranteed to be
disentangled, and the training routine is prohibitively expensive when the
model is scaled up. To address these limitations, we propose MANGO
(transformation Manifolds with Grouped Operators) for learning disentangled
operators that describe image transformations in distinct latent subspaces.
Moreover, our approach lets practitioners specify which transformations they
aim to model, improving the semantic meaning of the learned operators. Our
experiments demonstrate that MANGO enables composition of image
transformations and introduces a one-phase training routine that yields a
100x speedup over prior work.
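The grouped-operator idea can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the paper's implementation: the subspace dimensions, the choice of fixed 2-D rotation generators (MANGO learns its operators from data), and all function names are hypothetical. The sketch shows the two properties the abstract claims: a block-diagonal Lie group operator acting on disjoint latent subspaces leaves the other subspaces untouched (disentanglement), and coefficients along one generator add under composition (composability of transformations).

```python
import numpy as np

# Hypothetical sketch of grouped Lie operators (dims/names assumed).
# The latent space is split into disjoint 2-D subspaces; transformation k
# has a generator acting only on subspace k, so the full group element is
# block-diagonal and the transformations are disentangled by construction.

n_groups = 3  # e.g. rotation, thickness, blur — one subspace each

def rot_block(c):
    """Closed-form matrix exponential exp(c * [[0,-1],[1,0]]) of the
    canonical 2-D skew-symmetric generator: a plane rotation by angle c."""
    return np.array([[np.cos(c), -np.sin(c)],
                     [np.sin(c),  np.cos(c)]])

def apply_transform(z, coeffs):
    """Apply the block-diagonal group element with per-group coefficients
    `coeffs` to the latent vector z."""
    out = z.copy()
    for k, c in enumerate(coeffs):
        out[2 * k:2 * k + 2] = rot_block(c) @ z[2 * k:2 * k + 2]
    return out

rng = np.random.default_rng(0)
z = rng.standard_normal(2 * n_groups)

# Disentanglement: acting with group 0 only leaves the other subspaces fixed.
z_t = apply_transform(z, [0.7, 0.0, 0.0])
assert np.allclose(z_t[2:], z[2:])

# Composition: along one generator the coefficients add (a one-parameter
# subgroup), which is what makes transformations composable in latent space.
a = apply_transform(apply_transform(z, [0.3, 0.0, 0.0]), [0.4, 0.0, 0.0])
b = apply_transform(z, [0.7, 0.0, 0.0])
assert np.allclose(a, b)
```

In the actual model the generators are learned matrices rather than fixed rotations, but the block-diagonal structure is what guarantees that each learned transformation moves only its own latent subspace.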