{"title":"Top-Down Priors Disambiguate Target and Distractor Features in Simulated Covert Visual Search","authors":"Justin D. Theiss;Michael A. Silver","doi":"10.1162/neco_a_01700","DOIUrl":"10.1162/neco_a_01700","url":null,"abstract":"Several models of visual search consider visual attention as part of a perceptual inference process, in which top-down priors disambiguate bottom-up sensory information. Many of these models have focused on gaze behavior, but there are relatively fewer models of covert spatial attention, in which attention is directed to a peripheral location in visual space without a shift in gaze direction. Here, we propose a biologically plausible model of covert attention during visual search that helps to bridge the gap between Bayesian modeling and neurophysiological modeling by using (1) top-down priors over target features that are acquired through Hebbian learning, and (2) spatial resampling of modeled cortical receptive fields to enhance local spatial resolution of image representations for downstream target classification. By training a simple generative model using a Hebbian update rule, top-down priors for target features naturally emerge without the need for hand-tuned or predetermined priors. Furthermore, the implementation of covert spatial attention in our model is based on a known neurobiological mechanism, providing a plausible process through which Bayesian priors could locally enhance the spatial resolution of image representations. We validate this model during simulated visual search for handwritten digits among nondigit distractors, demonstrating that top-down priors improve accuracy for estimation of target location and classification, relative to bottom-up signals alone. Our results support previous reports in the literature that demonstrated beneficial effects of top-down priors on visual search performance, while extending this literature to incorporate known neural mechanisms of covert spatial attention.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 10","pages":"2201-2224"},"PeriodicalIF":2.7,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141984036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ali Tehrani-Saleh;J. Devin McAuley;Christoph Adami
{"title":"Mechanism of Duration Perception in Artificial Brains Suggests New Model of Attentional Entrainment","authors":"Ali Tehrani-Saleh;J. Devin McAuley;Christoph Adami","doi":"10.1162/neco_a_01699","DOIUrl":"10.1162/neco_a_01699","url":null,"abstract":"While cognitive theory has advanced several candidate frameworks to explain attentional entrainment, the neural basis for the temporal allocation of attention is unknown. Here we present a new model of attentional entrainment guided by empirical evidence obtained using a cohort of 50 artificial brains. These brains were evolved in silico to perform a duration judgment task similar to one where human subjects perform duration judgments in auditory oddball paradigms. We found that the artificial brains display psychometric characteristics remarkably similar to those of human listeners and exhibit similar patterns of distortions of perception when presented with out-of-rhythm oddballs. A detailed analysis of mechanisms behind the duration distortion suggests that attention peaks at the end of the tone, which is inconsistent with previous attentional entrainment models. Instead, the new model of entrainment emphasizes increased attention to those aspects of the stimulus that the brain expects to be highly informative.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 10","pages":"2170-2200"},"PeriodicalIF":2.7,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142037773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active Inference and Reinforcement Learning: A Unified Inference on Continuous State and Action Spaces Under Partial Observability","authors":"Parvin Malekzadeh;Konstantinos N. Plataniotis","doi":"10.1162/neco_a_01698","DOIUrl":"10.1162/neco_a_01698","url":null,"abstract":"Reinforcement learning (RL) has garnered significant attention for developing decision-making agents that aim to maximize rewards, specified by an external supervisor, within fully observable environments. However, many real-world problems involve partial or noisy observations, where agents cannot access complete and accurate information about the environment. These problems are commonly formulated as partially observable Markov decision processes (POMDPs). Previous studies have tackled RL in POMDPs by either incorporating the memory of past actions and observations or by inferring the true state of the environment from observed data. Nevertheless, aggregating observations and actions over time becomes impractical in problems with large decision-making time horizons and high-dimensional spaces. Furthermore, inference-based RL approaches often require many environmental samples to perform well, as they focus solely on reward maximization and neglect uncertainty in the inferred state. Active inference (AIF) is a framework naturally formulated in POMDPs and directs agents to select actions by minimizing a function called expected free energy (EFE). This supplies reward-maximizing (or exploitative) behavior, as in RL, with information-seeking (or exploratory) behavior. Despite this exploratory behavior of AIF, its use is limited to problems with small time horizons and discrete spaces due to the computational challenges associated with EFE. In this article, we propose a unified principle that establishes a theoretical connection between AIF and RL, enabling seamless integration of these two approaches and overcoming their limitations in continuous space POMDP settings. We substantiate our findings with rigorous theoretical analysis, providing novel perspectives for using AIF in designing and implementing artificial agents. Experimental results demonstrate the superior learning capabilities of our method compared to other alternative RL approaches in solving partially observable tasks with continuous spaces. Notably, our approach harnesses information-seeking exploration, enabling it to effectively solve reward-free problems and rendering explicit task reward design by an external supervisor optional.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 10","pages":"2073-2135"},"PeriodicalIF":2.7,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142037772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lærke Gebser Krohne;Ingeborg Helbech Hansen;Kristoffer H. Madsen
{"title":"On the Search for Data-Driven and Reproducible Schizophrenia Subtypes Using Resting State fMRI Data From Multiple Sites","authors":"Lærke Gebser Krohne;Ingeborg Helbech Hansen;Kristoffer H. Madsen","doi":"10.1162/neco_a_01689","DOIUrl":"10.1162/neco_a_01689","url":null,"abstract":"For decades, fMRI data have been used to search for biomarkers for patients with schizophrenia. Still, firm conclusions are yet to be made, which is often attributed to the high internal heterogeneity of the disorder. A promising way to disentangle the heterogeneity is to search for subgroups of patients with more homogeneous biological profiles. We applied an unsupervised multiple co-clustering (MCC) method to identify subtypes using functional connectivity data from a multisite resting-state data set. We merged data from two publicly available databases and split the data into a discovery data set (143 patients and 143 healthy controls (HC)) and an external test data set (63 patients and 63 HC) from independent sites. On the discovery data, we investigated the stability of the clustering toward data splits and initializations. Subsequently we searched for cluster solutions, also called “views,” with a significant diagnosis association and evaluated these based on their subject and feature cluster separability, and correlation to clinical manifestations as measured with the positive and negative syndrome scale (PANSS). Finally, we validated our findings by testing the diagnosis association on the external test data. A major finding of our study was that the stability of the clustering was highly dependent on variations in the data set, and even across initializations, we found only a moderate subject clustering stability. Nevertheless, we still discovered one view with a significant diagnosis association. This view reproducibly showed an overrepresentation of schizophrenia patients in three subject clusters, and one feature cluster showed a continuous trend, ranging from positive to negative connectivity values, when sorted according to the proportions of patients with schizophrenia. When investigating all patients, none of the feature clusters in the view were associated with severity of positive, negative, and generalized symptoms, indicating that the cluster solutions reflect other disease related mechanisms.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 9","pages":"1799-1831"},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spontaneous Emergence of Robustness to Light Variation in CNNs With a Precortically Inspired Module","authors":"J. Petkovic;R. Fioresi","doi":"10.1162/neco_a_01691","DOIUrl":"10.1162/neco_a_01691","url":null,"abstract":"The analogies between the mammalian primary visual cortex and the structure of CNNs used for image classification tasks suggest that the introduction of an additional preliminary convolutional module inspired by the mathematical modeling of the precortical neuronal circuits can improve robustness with respect to global light intensity and contrast variations in the input images. We validate this hypothesis using the popular databases MNIST, FashionMNIST, and SVHN for these variations once an extra module is added.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 9","pages":"1832-1853"},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Hyperdimensional Computing With Spiking Phasors","authors":"Jeff Orchard;P. Michael Furlong;Kathryn Simone","doi":"10.1162/neco_a_01693","DOIUrl":"10.1162/neco_a_01693","url":null,"abstract":"Hyperdimensional (HD) computing (also referred to as vector symbolic architectures, VSAs) offers a method for encoding symbols into vectors, allowing for those symbols to be combined in different ways to form other vectors in the same vector space. The vectors and operators form a compositional algebra, such that composite vectors can be decomposed back to their constituent vectors. Many useful algorithms have implementations in HD computing, such as classification, spatial navigation, language modeling, and logic. In this letter, we propose a spiking implementation of Fourier holographic reduced representation (FHRR), one of the most versatile VSAs. The phase of each complex number of an FHRR vector is encoded as a spike time within a cycle. Neuron models derived from these spiking phasors can perform the requisite vector operations to implement an FHRR. We demonstrate the power and versatility of our spiking networks in a number of foundational problem domains, including symbol binding and unbinding, spatial representation, function representation, function integration, and memory (i.e., signal delay).","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 9","pages":"1886-1911"},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Martin Magris;Mostafa Shabani;Alexandros Iosifidis
{"title":"Manifold Gaussian Variational Bayes on the Precision Matrix","authors":"Martin Magris;Mostafa Shabani;Alexandros Iosifidis","doi":"10.1162/neco_a_01686","DOIUrl":"10.1162/neco_a_01686","url":null,"abstract":"We propose an optimization algorithm for variational inference (VI) in complex models. Our approach relies on natural gradient updates where the variational space is a Riemann manifold. We develop an efficient algorithm for gaussian variational inference whose updates satisfy the positive definite constraint on the variational covariance matrix. Our manifold gaussian variational Bayes on the precision matrix (MGVBP) solution provides simple update rules, is straightforward to implement, and the use of the precision matrix parameterization has a significant computational advantage. Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models. Over five data sets, we empirically validate our feasible approach on different statistical and econometric models, discussing its performance with respect to baseline methods.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 9","pages":"1744-1798"},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142009907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiming Jiang;Jinlan Liu;Dongpo Xu;Danilo P. Mandic
{"title":"UAdam: Unified Adam-Type Algorithmic Framework for Nonconvex Optimization","authors":"Yiming Jiang;Jinlan Liu;Dongpo Xu;Danilo P. Mandic","doi":"10.1162/neco_a_01692","DOIUrl":"10.1162/neco_a_01692","url":null,"abstract":"Adam-type algorithms have become a preferred choice for optimization in the deep learning setting; however, despite their success, their convergence is still not well understood. To this end, we introduce a unified framework for Adam-type algorithms, termed UAdam. It is equipped with a general form of the second-order moment, which makes it possible to include Adam and its existing and future variants as special cases, such as NAdam, AMSGrad, AdaBound, AdaFom, and Adan. The approach is supported by a rigorous convergence analysis of UAdam in the general nonconvex stochastic setting, showing that UAdam converges to the neighborhood of stationary points with a rate of O(1/T). Furthermore, the size of the neighborhood decreases as the parameter β1 increases. Importantly, our analysis only requires the first-order momentum factor to be close enough to 1, without any restrictions on the second-order momentum factor. Theoretical results also reveal the convergence conditions of vanilla Adam, together with the selection of appropriate hyperparameters. This provides a theoretical guarantee for the analysis, applications, and further developments of the whole general class of Adam-type algorithms. Finally, several numerical experiments are provided to support our theoretical findings.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 9","pages":"1912-1938"},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hebbian Descent: A Unified View on Log-Likelihood Learning","authors":"Jan Melchior;Robin Schiewer;Laurenz Wiskott","doi":"10.1162/neco_a_01684","DOIUrl":"10.1162/neco_a_01684","url":null,"abstract":"This study discusses the negative impact of the derivative of the activation functions in the output layer of artificial neural networks, in particular in continual learning. We propose Hebbian descent as a theoretical framework to overcome this limitation, which is implemented through an alternative loss function for gradient descent we refer to as Hebbian descent loss. This loss is effectively the generalized log-likelihood loss and corresponds to an alternative weight update rule for the output layer wherein the derivative of the activation function is disregarded. We show how this update avoids vanishing error signals during backpropagation in saturated regions of the activation functions, which is particularly helpful in training shallow neural networks and deep neural networks where saturating activation functions are only used in the output layer. In combination with centering, Hebbian descent leads to better continual learning capabilities. It provides a unifying perspective on Hebbian learning, gradient descent, and generalized linear models, for all of which we discuss the advantages and disadvantages. Given activation functions with strictly positive derivative (as often the case in practice), Hebbian descent inherits the convergence properties of regular gradient descent. While established pairings of loss and output layer activation function (e.g., mean squared error with linear or cross-entropy with sigmoid/softmax) are subsumed by Hebbian descent, we provide general insights for designing arbitrary loss activation function combinations that benefit from Hebbian descent. For shallow networks, we show that Hebbian descent outperforms Hebbian learning, has a performance similar to regular gradient descent, and has a much better performance than all other tested update rules in continual learning. In combination with centering, Hebbian descent implements a forgetting mechanism that prevents catastrophic interference notably better than the other tested update rules. When training deep neural networks, our experimental results suggest that Hebbian descent has better or similar performance as gradient descent.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 9","pages":"1669-1712"},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142009906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intrinsic Rewards for Exploration Without Harm From Observational Noise: A Simulation Study Based on the Free Energy Principle","authors":"Theodore Jerome Tinker;Kenji Doya;Jun Tani","doi":"10.1162/neco_a_01690","DOIUrl":"10.1162/neco_a_01690","url":null,"abstract":"In reinforcement learning (RL), artificial agents are trained to maximize numerical rewards by performing tasks. Exploration is essential in RL because agents must discover information before exploiting it. Two rewards encouraging efficient exploration are the entropy of action policy and curiosity for information gain. Entropy is well established in the literature, promoting randomized action selection. Curiosity is defined in a broad variety of ways in literature, promoting discovery of novel experiences. One example, prediction error curiosity, rewards agents for discovering observations they cannot accurately predict. However, such agents may be distracted by unpredictable observational noises known as curiosity traps. Based on the free energy principle (FEP), this letter proposes hidden state curiosity, which rewards agents by the KL divergence between the predictive prior and posterior probabilities of latent variables. We trained six types of agents to navigate mazes: baseline agents without rewards for entropy or curiosity and agents rewarded for entropy and/or either prediction error curiosity or hidden state curiosity. We find that entropy and curiosity result in efficient exploration, especially both employed together. Notably, agents with hidden state curiosity demonstrate resilience against curiosity traps, which hinder agents with prediction error curiosity. This suggests implementing the FEP that may enhance the robustness and generalization of RL models, potentially aligning the learning processes of artificial and biological agents.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 9","pages":"1854-1885"},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}