{"title":"Quantifying and Maximizing the Information Flux in Recurrent Neural Networks","authors":"Claus Metzner;Marius E. Yamakou;Dennis Voelkl;Achim Schilling;Patrick Krauss","doi":"10.1162/neco_a_01651","DOIUrl":"10.1162/neco_a_01651","url":null,"abstract":"Free-running recurrent neural networks (RNNs), especially probabilistic models, generate an ongoing information flux that can be quantified with the mutual information I[x→(t),x→(t+1)] between subsequent system states x→. Although previous studies have shown that I depends on the statistics of the network's connection weights, it is unclear how to maximize I systematically and how to quantify the flux in large systems where computing the mutual information becomes intractable. Here, we address these questions using Boltzmann machines as model systems. We find that in networks with moderately strong connections, the mutual information I is approximately a monotonic transformation of the root-mean-square averaged Pearson correlations between neuron pairs, a quantity that can be efficiently computed even in large systems. Furthermore, evolutionary maximization of I[x→(t),x→(t+1)] reveals a general design principle for the weight matrices enabling the systematic construction of systems with a high spontaneous information flux. Finally, we simultaneously maximize information flux and the mean period length of cyclic attractors in the state-space of these dynamical networks. Our results are potentially useful for the construction of RNNs that serve as short-time memories or pattern generators.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 3","pages":"351-384"},"PeriodicalIF":2.9,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139747774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active Learning for Discrete Latent Variable Models","authors":"Aditi Jha;Zoe C. Ashwood;Jonathan W. Pillow","doi":"10.1162/neco_a_01646","DOIUrl":"10.1162/neco_a_01646","url":null,"abstract":"Active learning seeks to reduce the amount of data required to fit the parameters of a model, thus forming an important class of techniques in modern machine learning. However, past work on active learning has largely overlooked latent variable models, which play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines. Here we address this gap by proposing a novel framework for maximum-mutual-information input selection for discrete latent variable regression models. We first apply our method to a class of models known as mixtures of linear regressions (MLR). While it is well known that active learning confers no advantage for linear-gaussian regression models, we use Fisher information to show analytically that active learning can nevertheless achieve large gains for mixtures of such models, and we validate this improvement using both simulations and real-world data. We then consider a powerful class of temporally structured latent variable models given by a hidden Markov model (HMM) with generalized linear model (GLM) observations, which has recently been used to identify discrete states from animal decision-making data. We show that our method substantially reduces the amount of data needed to fit GLM-HMMs and outperforms a variety of approximate methods based on variational and amortized inference. Infomax learning for latent variable models thus offers a powerful approach for characterizing temporally structured latent states, with a wide variety of applications in neuroscience and beyond.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 3","pages":"437-474"},"PeriodicalIF":2.9,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139747685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advantages of Persistent Cohomology in Estimating Animal Location From Grid Cell Population Activity","authors":"Daisuke Kawahara;Shigeyoshi Fujisawa","doi":"10.1162/neco_a_01645","DOIUrl":"10.1162/neco_a_01645","url":null,"abstract":"Many cognitive functions are represented as cell assemblies. In the case of spatial navigation, the population activity of place cells in the hippocampus and grid cells in the entorhinal cortex represents self-location in the environment. The brain cannot directly observe self-location information in the environment. Instead, it relies on sensory information and memory to estimate self-location. Therefore, estimating low-dimensional dynamics, such as the movement trajectory of an animal exploring its environment, from only the high-dimensional neural activity is important in deciphering the information represented in the brain. Most previous studies have estimated the low-dimensional dynamics (i.e., latent variables) behind neural activity by unsupervised learning with Bayesian population decoding using artificial neural networks or gaussian processes. Recently, persistent cohomology has been used to estimate latent variables from the phase information (i.e., circular coordinates) of manifolds created by neural activity. However, the advantages of persistent cohomology over Bayesian population decoding are not well understood. We compared persistent cohomology and Bayesian population decoding in estimating the animal location from simulated and actual grid cell population activity. We found that persistent cohomology can estimate the animal location with fewer neurons than Bayesian population decoding and robustly estimate the animal location from actual noisy data.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 3","pages":"385-411"},"PeriodicalIF":2.9,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139747686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Errata to “A Tutorial on the Spectral Theory of Markov Chains” by Eddie Seabrook and Laurenz Wiskott (Neural Computation, November 2023, Vol. 35, No. 11, pp. 1713–1796, https://doi.org/10.1162/neco_a_01611)","authors":"","doi":"10.1162/neco_e_01662","DOIUrl":"10.1162/neco_e_01662","url":null,"abstract":"","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 3","pages":"499-500"},"PeriodicalIF":2.9,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139747701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Only on Boundaries: A Physics-Informed Neural Operator for Solving Parametric Partial Differential Equations in Complex Geometries","authors":"Zhiwei Fang;Sifan Wang;Paris Perdikaris","doi":"10.1162/neco_a_01647","DOIUrl":"10.1162/neco_a_01647","url":null,"abstract":"Recently, deep learning surrogates and neural operators have shown promise in solving partial differential equations (PDEs). However, they often require a large amount of training data and are limited to bounded domains. In this work, we present a novel physics-informed neural operator method to solve parameterized boundary value problems without labeled data. By reformulating the PDEs into boundary integral equations (BIEs), we can train the operator network solely on the boundary of the domain. This approach reduces the number of required sample points from O(Nd) to O(Nd-1), where d is the domain's dimension, leading to a significant acceleration of the training process. Additionally, our method can handle unbounded problems, which are unattainable for existing physics-informed neural networks (PINNs) and neural operators. Our numerical experiments show the effectiveness of parameterized complex geometries and unbounded problems.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 3","pages":"475-498"},"PeriodicalIF":2.9,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139747703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cooperativity, Information Gain, and Energy Cost During Early LTP in Dendritic Spines","authors":"Jan Karbowski;Paulina Urban","doi":"10.1162/neco_a_01632","DOIUrl":"10.1162/neco_a_01632","url":null,"abstract":"We investigate a mutual relationship between information and energy during the early phase of LTP induction and maintenance in a large-scale system of mutually coupled dendritic spines, with discrete internal states and probabilistic dynamics, within the framework of nonequilibrium stochastic thermodynamics. In order to analyze this computationally intractable stochastic multidimensional system, we introduce a pair approximation, which allows us to reduce the spine dynamics into a lower-dimensional manageable system of closed equations. We found that the rates of information gain and energy attain their maximal values during an initial period of LTP (i.e., during stimulation), and after that, they recover to their baseline low values, as opposed to a memory trace that lasts much longer. This suggests that the learning phase is much more energy demanding than the memory phase. We show that positive correlations between neighboring spines increase both a duration of memory trace and energy cost during LTP, but the memory time per invested energy increases dramatically for very strong, positive synaptic cooperativity, suggesting a beneficial role of synaptic clustering on memory duration. In contrast, information gain after LTP is the largest for negative correlations, and energy efficiency of that information generally declines with increasing synaptic cooperativity. We also find that dendritic spines can use sparse representations for encoding long-term information, as both energetic and structural efficiencies of retained information and its lifetime exhibit maxima for low fractions of stimulated synapses during LTP. Moreover, we find that such efficiencies drop significantly with increasing the number of spines. In general, our stochastic thermodynamics approach provides a unifying framework for studying, from first principles, information encoding, and its energy cost during learning and memory in stochastic systems of interacting synapses.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 2","pages":"271-311"},"PeriodicalIF":2.9,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138687565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emergence of Universal Computations Through Neural Manifold Dynamics","authors":"Joan Gort","doi":"10.1162/neco_a_01631","DOIUrl":"10.1162/neco_a_01631","url":null,"abstract":"There is growing evidence that many forms of neural computation may be implemented by low-dimensional dynamics unfolding at the population scale. However, neither the connectivity structure nor the general capabilities of these embedded dynamical processes are currently understood. In this work, the two most common formalisms of firing-rate models are evaluated using tools from analysis, topology, and nonlinear dynamics in order to provide plausible explanations for these problems. It is shown that low-rank structured connectivities predict the formation of invariant and globally attracting manifolds in all these models. Regarding the dynamics arising in these manifolds, it is proved they are topologically equivalent across the considered formalisms. This letter also shows that under the low-rank hypothesis, the flows emerging in neural manifolds, including input-driven systems, are universal, which broadens previous findings. It explores how low-dimensional orbits can bear the production of continuous sets of muscular trajectories, the implementation of central pattern generators, and the storage of memory states. These dynamics can robustly simulate any Turing machine over arbitrary bounded memory strings, virtually endowing rate models with the power of universal computation. In addition, the letter shows how the low-rank hypothesis predicts the parsimonious correlation structure observed in cortical activity. Finally, it discusses how this theory could provide a useful tool from which to study neuropsychological phenomena using mathematical methods.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 2","pages":"227-270"},"PeriodicalIF":2.9,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138687695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Q&A Label Learning","authors":"Kota Kawamoto;Masato Uchida","doi":"10.1162/neco_a_01633","DOIUrl":"10.1162/neco_a_01633","url":null,"abstract":"Assigning labels to instances is crucial for supervised machine learning. In this letter, we propose a novel annotation method, Q&A labeling, which involves a question generator that asks questions about the labels of the instances to be assigned and an annotator that answers the questions and assigns the corresponding labels to the instances. We derived a generative model of labels assigned according to two Q&A labeling procedures that differ in the way questions are asked and answered. We showed that in both procedures, the derived model is partially consistent with that assumed in previous studies. The main distinction of this study from previous ones lies in the fact that the label generative model was not assumed but, rather, derived based on the definition of a specific annotation method, Q&A labeling. We also derived a loss function to evaluate the classification risk of ordinary supervised machine learning using instances assigned Q&A labels and evaluated the upper bound of the classification error. The results indicate statistical consistency in learning with Q&A labels.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 2","pages":"312-349"},"PeriodicalIF":2.9,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138687568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Decoding of Large-Scale Neural Population Responses With Gaussian-Process Multiclass Regression","authors":"C. Daniel Greenidge;Benjamin Scholl;Jacob L. Yates;Jonathan W. Pillow","doi":"10.1162/neco_a_01630","DOIUrl":"10.1162/neco_a_01630","url":null,"abstract":"Neural decoding methods provide a powerful tool for quantifying the information content of neural population codes and the limits imposed by correlations in neural activity. However, standard decoding methods are prone to overfitting and scale poorly to high-dimensional settings. Here, we introduce a novel decoding method to overcome these limitations. Our approach, the gaussian process multiclass decoder (GPMD), is well suited to decoding a continuous low-dimensional variable from high-dimensional population activity and provides a platform for assessing the importance of correlations in neural population codes. The GPMD is a multinomial logistic regression model with a gaussian process prior over the decoding weights. The prior includes hyperparameters that govern the smoothness of each neuron's decoding weights, allowing automatic pruning of uninformative neurons during inference. We provide a variational inference method for fitting the GPMD to data, which scales to hundreds or thousands of neurons and performs well even in data sets with more neurons than trials. We apply the GPMD to recordings from primary visual cortex in three species: monkey, ferret, and mouse. Our decoder achieves state-of-the-art accuracy on all three data sets and substantially outperforms independent Bayesian decoding, showing that knowledge of the correlation structure is essential for optimal decoding in all three species.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 2","pages":"175-226"},"PeriodicalIF":2.9,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138687239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling the Role of Contour Integration in Visual Inference","authors":"Salman Khan;Alexander Wong;Bryan Tripp","doi":"10.1162/neco_a_01625","DOIUrl":"10.1162/neco_a_01625","url":null,"abstract":"Under difficult viewing conditions, the brain's visual system uses a variety of recurrent modulatory mechanisms to augment feedforward processing. One resulting phenomenon is contour integration, which occurs in the primary visual (V1) cortex and strengthens neural responses to edges if they belong to a larger smooth contour. Computational models have contributed to an understanding of the circuit mechanisms of contour integration, but less is known about its role in visual perception. To address this gap, we embedded a biologically grounded model of contour integration in a task-driven artificial neural network and trained it using a gradient-descent variant. We used this model to explore how brain-like contour integration may be optimized for high-level visual objectives as well as its potential roles in perception. When the model was trained to detect contours in a background of random edges, a task commonly used to examine contour integration in the brain, it closely mirrored the brain in terms of behavior, neural responses, and lateral connection patterns. When trained on natural images, the model enhanced weaker contours and distinguished whether two points lay on the same versus different contours. The model learned robust features that generalized well to out-of-training-distribution stimuli. Surprisingly, and in contrast with the synthetic task, a parameter-matched control network without recurrence performed the same as or better than the model on the natural-image tasks. Thus, a contour integration mechanism is not essential to perform these more naturalistic contour-related tasks. Finally, the best performance in all tasks was achieved by a modified contour integration model that did not distinguish between excitatory and inhibitory neurons.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 1","pages":"33-74"},"PeriodicalIF":2.9,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10534913","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138489120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}