Towards efficient representation identification in supervised learning

CLEaR · Pub Date: 2022-04-10 · DOI: 10.48550/arXiv.2204.04606
Kartik Ahuja, Divyat Mahajan, Vasilis Syrgkanis, Ioannis Mitliagkas
{"title":"Towards efficient representation identification in supervised learning","authors":"Kartik Ahuja, Divyat Mahajan, Vasilis Syrgkanis, Ioannis Mitliagkas","doi":"10.48550/arXiv.2204.04606","DOIUrl":null,"url":null,"abstract":"Humans have a remarkable ability to disentangle complex sensory inputs (e.g., image, text) into simple factors of variation (e.g., shape, color) without much supervision. This ability has inspired many works that attempt to solve the following question: how do we invert the data generation process to extract those factors with minimal or no supervision? Several works in the literature on non-linear independent component analysis have established this negative result; without some knowledge of the data generation process or appropriate inductive biases, it is impossible to perform this inversion. In recent years, a lot of progress has been made on disentanglement under structural assumptions, e.g., when we have access to auxiliary information that makes the factors of variation conditionally independent. However, existing work requires a lot of auxiliary information, e.g., in supervised classification, it prescribes that the number of label classes should be at least equal to the total dimension of all factors of variation. In this work, we depart from these assumptions and ask: a) How can we get disentanglement when the auxiliary information does not provide conditional independence over the factors of variation? b) Can we reduce the amount of auxiliary information required for disentanglement? For a class of models where auxiliary information does not ensure conditional independence, we show theoretically and experimentally that disentanglement (to a large extent) is possible even when the auxiliary information dimension is much less than the dimension of the true latent representation.","PeriodicalId":171742,"journal":{"name":"CLEaR","volume":"486 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"CLEaR","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2204.04606","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10

Abstract

Humans have a remarkable ability to disentangle complex sensory inputs (e.g., image, text) into simple factors of variation (e.g., shape, color) without much supervision. This ability has inspired many works that attempt to solve the following question: how do we invert the data generation process to extract those factors with minimal or no supervision? Several works in the literature on non-linear independent component analysis have established the following negative result: without some knowledge of the data generation process or appropriate inductive biases, it is impossible to perform this inversion. In recent years, a lot of progress has been made on disentanglement under structural assumptions, e.g., when we have access to auxiliary information that makes the factors of variation conditionally independent. However, existing work requires a lot of auxiliary information; in supervised classification, for example, it prescribes that the number of label classes be at least equal to the total dimension of all factors of variation. In this work, we depart from these assumptions and ask: a) How can we get disentanglement when the auxiliary information does not provide conditional independence over the factors of variation? b) Can we reduce the amount of auxiliary information required for disentanglement? For a class of models where auxiliary information does not ensure conditional independence, we show theoretically and experimentally that disentanglement (to a large extent) is possible even when the auxiliary information dimension is much smaller than the dimension of the true latent representation.
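To make the setting concrete, below is a minimal synthetic sketch of the kind of problem the abstract describes; it is not the authors' code, and all model choices, dimensions, and the data-generating process are illustrative assumptions. Independent latent factors z are mixed nonlinearly into observations x, only a low-dimensional auxiliary label y with dim(y) < dim(z) is observed, an encoder with a linear head is fit by ordinary empirical risk minimization, and disentanglement is then scored with a simple correlation-based proxy.

```python
# A minimal, self-contained sketch of the setting described in the abstract.
# All names, dimensions, and the data-generating process are illustrative
# assumptions, not the authors' code.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

d_z, d_y, n = 6, 2, 5000        # latent dim, label dim (d_y << d_z), samples

# Synthetic data generation: z -> x via a fixed random nonlinear mixing,
# plus a low-dimensional label y that depends linearly on z.
z = torch.randn(n, d_z)
mixing = nn.Sequential(nn.Linear(d_z, d_z), nn.LeakyReLU(0.2),
                       nn.Linear(d_z, d_z))
for p in mixing.parameters():
    p.requires_grad_(False)
x = mixing(z)
W = torch.randn(d_z, d_y)
y = z @ W + 0.1 * torch.randn(n, d_y)

# Encoder + linear head trained by plain ERM on the label-prediction task.
encoder = nn.Sequential(nn.Linear(d_z, 64), nn.LeakyReLU(0.2),
                        nn.Linear(64, d_z))
head = nn.Linear(d_z, d_y)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()),
                       lr=1e-3)
for step in range(1500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(head(encoder(x)), y)
    loss.backward()
    opt.step()

# Disentanglement proxy: for each true factor, take the best-matching learned
# coordinate's absolute Pearson correlation and average (a greedy stand-in for
# the mean correlation coefficient, which would use an optimal assignment).
with torch.no_grad():
    z_hat = encoder(x).numpy()
corr = np.abs(np.corrcoef(z.numpy().T, z_hat.T)[:d_z, d_z:])
print(f"mean max |correlation|: {corr.max(axis=1).mean():.3f}")
```

With only a 2-dimensional label guiding a 6-dimensional latent space, the printed score should be read as a diagnostic of this under-supervised regime rather than a reproduction of the paper's results; the paper's contribution is to characterize when, and to what extent, disentanglement remains possible in exactly this situation.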