Active Learning for Discrete Latent Variable Models

IF 2.7; Zone 4 (Computer Science); JCR Q3 (Computer Science, Artificial Intelligence)

Neural Computation. Pub Date: 2024-02-16. DOI: 10.1162/neco_a_01646

Aditi Jha;Zoe C. Ashwood;Jonathan W. Pillow

Neural Computation, vol. 36, no. 3, pp. 437-474. Journal article. Listing: https://ieeexplore.ieee.org/document/10535097/

Citations: 0

Abstract

Active learning seeks to reduce the amount of data required to fit the parameters of a model, thus forming an important class of techniques in modern machine learning. However, past work on active learning has largely overlooked latent variable models, which play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines. Here we address this gap by proposing a novel framework for maximum-mutual-information input selection for discrete latent variable regression models. We first apply our method to a class of models known as mixtures of linear regressions (MLR). While it is well known that active learning confers no advantage for linear-gaussian regression models, we use Fisher information to show analytically that active learning can nevertheless achieve large gains for mixtures of such models, and we validate this improvement using both simulations and real-world data. We then consider a powerful class of temporally structured latent variable models given by a hidden Markov model (HMM) with generalized linear model (GLM) observations, which has recently been used to identify discrete states from animal decision-making data. We show that our method substantially reduces the amount of data needed to fit GLM-HMMs and outperforms a variety of approximate methods based on variational and amortized inference. Infomax learning for latent variable models thus offers a powerful approach for characterizing temporally structured latent states, with a wide variety of applications in neuroscience and beyond.
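The maximum-mutual-information criterion described above can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal example under several assumptions that do not come from the abstract: observations are binary (Bernoulli GLM with a logistic link), the discrete latent state is marginalized out using given state probabilities (so the temporal HMM structure is ignored), and the parameter posterior is represented by a set of samples. The names `weights`, `mix`, and `select_input` are hypothetical. For each candidate input x, the score is the entropy of the posterior-averaged prediction minus the average entropy of the per-sample predictions, which estimates the mutual information between the response y and the parameters.

```python
import numpy as np

def bernoulli_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) variable, elementwise."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log1p(-p))

def predictive_prob(x, weights, mix):
    """p(y=1 | x, theta_s) for each posterior sample s.

    x:       (D,)      candidate input
    weights: (S, K, D) GLM weights per posterior sample and latent state (assumed layout)
    mix:     (S, K)    latent-state probabilities per posterior sample (assumed layout)
    """
    logits = weights @ x                       # (S, K)
    per_state = 1.0 / (1.0 + np.exp(-logits))  # logistic link, (S, K)
    return np.sum(mix * per_state, axis=1)     # marginalize the latent state, (S,)

def mutual_information(x, weights, mix):
    """Sample-based estimate of I(y; theta | x) for a binary response."""
    p = predictive_prob(x, weights, mix)
    marginal_entropy = bernoulli_entropy(p.mean())      # H[ E_theta p(y | x, theta) ]
    conditional_entropy = bernoulli_entropy(p).mean()   # E_theta H[ p(y | x, theta) ]
    return marginal_entropy - conditional_entropy

def select_input(candidates, weights, mix):
    """Return the index of the highest-scoring candidate and all scores."""
    scores = np.array([mutual_information(x, weights, mix) for x in candidates])
    return int(np.argmax(scores)), scores

# Illustrative usage with synthetic "posterior samples" (not real data).
rng = np.random.default_rng(0)
weights = rng.normal(size=(200, 2, 3))          # 200 samples, 2 states, 3 input dims
mix = rng.dirichlet(np.ones(2), size=200)       # state probabilities per sample
candidates = rng.normal(size=(25, 3))
best_index, scores = select_input(candidates, weights, mix)
print("selected candidate:", candidates[best_index])
```

In a full active learning loop, the selected input would be presented, the response recorded, and the posterior samples refreshed before scoring the next round of candidates. For Gaussian observations, as in mixtures of linear regressions, the marginal entropy has no closed form, so this binary sketch would need a different entropy estimate.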


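Source journal

Neural Computation (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore

6.30

Self-citation rate

3.40%

Articles published

Review time

3.0 months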

About the journal: Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.