Seongil Im;Jae-Seung Jeong;Junseo Lee;Changhwan Shin;Jeong Ho Cho;Hyunsu Ju
{"title":"Column Row Convolutional Neural Network: Reducing Parameters for Efficient Image Processing","authors":"Seongil Im;Jae-Seung Jeong;Junseo Lee;Changhwan Shin;Jeong Ho Cho;Hyunsu Ju","doi":"10.1162/neco_a_01653","DOIUrl":"10.1162/neco_a_01653","url":null,"abstract":"Recent advancements in deep learning have achieved significant progress by increasing the number of parameters in a given model. However, this comes at the cost of computing resources, prompting researchers to explore model compression techniques that reduce the number of parameters while maintaining or even improving performance. Convolutional neural networks (CNNs) have been recognized as more efficient and effective than fully connected (FC) networks. We propose a column row convolutional neural network (CRCNN) in this letter that applies 1D convolution to image data, significantly reducing the number of learning parameters and operational steps. The CRCNN uses column and row local receptive fields to perform data abstraction, concatenating each direction's feature before connecting it to an FC layer. Experimental results demonstrate that the CRCNN maintains comparable accuracy while reducing the number of parameters compared to prior work. Moreover, the CRCNN is employed for one-class anomaly detection, demonstrating its feasibility for various applications.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"744-758"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vidyesh Rao Anisetti;Ananth Kandala;Benjamin Scellier;J. M. Schwarz
{"title":"Frequency Propagation: Multimechanism Learning in Nonlinear Physical Networks","authors":"Vidyesh Rao Anisetti;Ananth Kandala;Benjamin Scellier;J. M. Schwarz","doi":"10.1162/neco_a_01648","DOIUrl":"10.1162/neco_a_01648","url":null,"abstract":"We introduce frequency propagation, a learning algorithm for nonlinear physical networks. In a resistive electrical circuit with variable resistors, an activation current is applied at a set of input nodes at one frequency and an error current is applied at a set of output nodes at another frequency. The voltage response of the circuit to these boundary currents is the superposition of an activation signal and an error signal whose coefficients can be read in different frequencies of the frequency domain. Each conductance is updated proportionally to the product of the two coefficients. The learning rule is local and proved to perform gradient descent on a loss function. We argue that frequency propagation is an instance of a multimechanism learning strategy for physical networks, be it resistive, elastic, or flow networks. Multimechanism learning strategies incorporate at least two physical quantities, potentially governed by independent physical mechanisms, to act as activation and error signals in the training process. Locally available information about these two signals is then used to update the trainable parameters to perform gradient descent. We demonstrate how earlier work implementing learning via chemical signaling in flow networks (Anisetti, Scellier, et al., 2023) also falls under the rubric of multimechanism learning.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"596-620"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Korobov Functions by Correntropy and Convolutional Neural Networks","authors":"Zhiying Fang;Tong Mao;Jun Fan","doi":"10.1162/neco_a_01650","DOIUrl":"10.1162/neco_a_01650","url":null,"abstract":"Combining information-theoretic learning with deep learning has gained significant attention in recent years, as it offers a promising approach to tackle the challenges posed by big data. However, the theoretical understanding of convolutional structures, which are vital to many structured deep learning models, remains incomplete. To partially bridge this gap, this letter aims to develop generalization analysis for deep convolutional neural network (CNN) algorithms using learning theory. Specifically, we focus on investigating robust regression using correntropy-induced loss functions derived from information-theoretic learning. Our analysis demonstrates an explicit convergence rate for deep CNN-based robust regression algorithms when the target function resides in the Korobov space. This study sheds light on the theoretical underpinnings of CNNs and provides a framework for understanding their performance and limitations.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"718-743"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lateral Connections Improve Generalizability of Learning in a Simple Neural Network","authors":"Garrett Crutcher","doi":"10.1162/neco_a_01640","DOIUrl":"10.1162/neco_a_01640","url":null,"abstract":"To navigate the world around us, neural circuits rapidly adapt to their environment learning generalizable strategies to decode information. When modeling these learning strategies, network models find the optimal solution to satisfy one task condition but fail when introduced to a novel task or even a different stimulus in the same space. In the experiments described in this letter, I investigate the role of lateral gap junctions in learning generalizable strategies to process information. Lateral gap junctions are formed by connexin proteins creating an open pore that allows for direct electrical signaling between two neurons. During neural development, the rate of gap junctions is high, and daughter cells that share similar tuning properties are more likely to be connected by these junctions. Gap junctions are highly plastic and get heavily pruned throughout development. I hypothesize that they mediate generalized learning by imprinting the weighting structure within a layer to avoid overfitting to one task condition. To test this hypothesis, I implemented a feedforward probabilistic neural network mimicking a cortical fast spiking neuron circuit that is heavily involved in movement. Many of these cells are tuned to speeds that I used as the input stimulus for the network to estimate. When training this network using a delta learning rule, both a laterally connected network and an unconnected network can estimate a single speed. However, when asking the network to estimate two or more speeds, alternated in training, an unconnected network either cannot learn speed or optimizes to a singular speed, while the laterally connected network learns the generalizable strategy and can estimate both speeds. These results suggest that lateral gap junctions between neurons enable generalized learning, which may help explain learning differences across life span.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"705-717"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probing the Structure and Functional Properties of the Dropout-Induced Correlated Variability in Convolutional Neural Networks","authors":"Xu Pan;Ruben Coen-Cagli;Odelia Schwartz","doi":"10.1162/neco_a_01652","DOIUrl":"10.1162/neco_a_01652","url":null,"abstract":"Computational neuroscience studies have shown that the structure of neural variability to an unchanged stimulus affects the amount of information encoded. Some artificial deep neural networks, such as those with Monte Carlo dropout layers, also have variable responses when the input is fixed. However, the structure of the trial-by-trial neural covariance in neural networks with dropout has not been studied, and its role in decoding accuracy is unknown. We studied the above questions in a convolutional neural network model with dropout in both the training and testing phases. We found that trial-by-trial correlation between neurons (i.e., noise correlation) is positive and low dimensional. Neurons that are close in a feature map have larger noise correlation. These properties are surprisingly similar to the findings in the visual cortex. We further analyzed the alignment of the main axes of the covariance matrix. We found that different images share a common trial-by-trial noise covariance subspace, and they are aligned with the global signal covariance. This evidence that the noise covariance is aligned with signal covariance suggests that noise covariance in dropout neural networks reduces network accuracy, which we further verified directly with a trial-shuffling procedure commonly used in neuroscience. These findings highlight a previously overlooked aspect of dropout layers that can affect network performance. Such dropout networks could also potentially be a computational model of neural variability.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"621-644"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Madison Cotteret;Hugh Greatorex;Martin Ziegler;Elisabetta Chicca
{"title":"Vector Symbolic Finite State Machines in Attractor Neural Networks","authors":"Madison Cotteret;Hugh Greatorex;Martin Ziegler;Elisabetta Chicca","doi":"10.1162/neco_a_01638","DOIUrl":"10.1162/neco_a_01638","url":null,"abstract":"Hopfield attractor networks are robust distributed models of human memory, but they lack a general mechanism for effecting state-dependent attractor transitions in response to input. We propose construction rules such that an attractor network may implement an arbitrary finite state machine (FSM), where states and stimuli are represented by high-dimensional random vectors and all state transitions are enacted by the attractor network's dynamics. Numerical simulations show the capacity of the model, in terms of the maximum size of implementable FSM, to be linear in the size of the attractor network for dense bipolar state vectors and approximately quadratic for sparse binary state vectors. We show that the model is robust to imprecise and noisy weights, and so a prime candidate for implementation with high-density but unreliable devices. By endowing attractor networks with the ability to emulate arbitrary FSMs, we propose a plausible path by which FSMs could exist as a distributed computational primitive in biological neural networks.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"549-595"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10535093","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alireza Poshtkohi;John Wade;Liam McDaid;Junxiu Liu;Mark L. Dallas;Angela Bithell
{"title":"Mathematical Modeling of PI3K/Akt Pathway in Microglia","authors":"Alireza Poshtkohi;John Wade;Liam McDaid;Junxiu Liu;Mark L. Dallas;Angela Bithell","doi":"10.1162/neco_a_01643","DOIUrl":"10.1162/neco_a_01643","url":null,"abstract":"The motility of microglia involves intracellular signaling pathways that are predominantly controlled by changes in cytosolic Ca2+ and activation of PI3K/Akt (phosphoinositide-3-kinase/protein kinase B). In this letter, we develop a novel biophysical model for cytosolic Ca2+ activation of the PI3K/Akt pathway in microglia where Ca2+ influx is mediated by both P2Y purinergic receptors (P2YR) and P2X purinergic receptors (P2XR). The model parameters are estimated by employing optimization techniques to fit the model to phosphorylated Akt (pAkt) experimental modeling/in vitro data. The integrated model supports the hypothesis that Ca2+ influx via P2YR and P2XR can explain the experimentally reported biphasic transient responses in measuring pAkt levels. Our predictions reveal new quantitative insights into P2Rs on how they regulate Ca2+ and Akt in terms of physiological interactions and transient responses. It is shown that the upregulation of P2X receptors through a repetitive application of agonist results in a continual increase in the baseline [Ca2+], which causes the biphasic response to become a monophasic response which prolongs elevated levels of pAkt.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"645-676"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toon Van de Maele;Tim Verbelen;Pietro Mazzaglia;Stefano Ferraro;Bart Dhoedt
{"title":"Object-Centric Scene Representations Using Active Inference","authors":"Toon Van de Maele;Tim Verbelen;Pietro Mazzaglia;Stefano Ferraro;Bart Dhoedt","doi":"10.1162/neco_a_01637","DOIUrl":"10.1162/neco_a_01637","url":null,"abstract":"Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 4","pages":"677-704"},"PeriodicalIF":2.9,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140066269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vincent Painchaud;Patrick Desrosiers;Nicolas Doyon
{"title":"The Determining Role of Covariances in Large Networks of Stochastic Neurons","authors":"Vincent Painchaud;Patrick Desrosiers;Nicolas Doyon","doi":"10.1162/neco_a_01656","DOIUrl":"10.1162/neco_a_01656","url":null,"abstract":"Biological neural networks are notoriously hard to model due to their stochastic behavior and high dimensionality. We tackle this problem by constructing a dynamical model of both the expectations and covariances of the fractions of active and refractory neurons in the network’s populations. We do so by describing the evolution of the states of individual neurons with a continuous-time Markov chain, from which we formally derive a low-dimensional dynamical system. This is done by solving a moment closure problem in a way that is compatible with the nonlinearity and boundedness of the activation function. Our dynamical system captures the behavior of the high-dimensional stochastic model even in cases where the mean-field approximation fails to do so. Taking into account the second-order moments modifies the solutions that would be obtained with the mean-field approximation and can lead to the appearance or disappearance of fixed points and limit cycles. We moreover perform numerical experiments where the mean-field approximation leads to periodically oscillating solutions, while the solutions of the second-order model can be interpreted as an average taken over many realizations of the stochastic model. Altogether, our results highlight the importance of including higher moments when studying stochastic networks and deepen our understanding of correlated neuronal activity.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 6","pages":"1121-1162"},"PeriodicalIF":2.7,"publicationDate":"2024-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140805835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Does the Inner Retinal Network Shape the Ganglion Cells Receptive Field? A Computational Study","authors":"Evgenia Kartsaki;Gerrit Hilgen;Evelyne Sernagor;Bruno Cessac","doi":"10.1162/neco_a_01663","DOIUrl":"10.1162/neco_a_01663","url":null,"abstract":"We consider a model of basic inner retinal connectivity where bipolar and amacrine cells interconnect and both cell types project onto ganglion cells, modulating their response output to the brain visual areas. We derive an analytical formula for the spatiotemporal response of retinal ganglion cells to stimuli, taking into account the effects of amacrine cells inhibition. This analysis reveals two important functional parameters of the network: (1) the intensity of the interactions between bipolar and amacrine cells and (2) the characteristic timescale of these responses. Both parameters have a profound combined impact on the spatiotemporal features of retinal ganglion cells’ responses to light. The validity of the model is confirmed by faithfully reproducing pharmacogenetic experimental results obtained by stimulating excitatory DREADDs (Designer Receptors Exclusively Activated by Designer Drugs) expressed on ganglion cells and amacrine cells’ subclasses, thereby modifying the inner retinal network activity to visual stimuli in a complex, entangled manner. Our mathematical model allows us to explore and decipher these complex effects in a manner that would not be feasible experimentally and provides novel insights in retinal dynamics.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 6","pages":"1041-1083"},"PeriodicalIF":2.7,"publicationDate":"2024-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140805848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}