{"title":"Hardware-Friendly Implementation of Physical Reservoir Computing with CMOS-based Time-domain Analog Spiking Neurons","authors":"Nanako Kimura, Ckristian Duran, Zolboo Byambadorj, Ryosho Nakane, Tetsuya Iizuka","doi":"arxiv-2409.11612","DOIUrl":"https://doi.org/arxiv-2409.11612","url":null,"abstract":"This paper introduces an analog spiking neuron that utilizes time-domain information, i.e., a time interval of two signal transitions and a pulse width, to construct a spiking neural network (SNN) for a hardware-friendly physical reservoir computing (RC) on a complementary metal-oxide-semiconductor (CMOS) platform. A neuron with leaky integrate-and-fire is realized by employing two voltage-controlled oscillators (VCOs) with opposite sensitivities to the internal control voltage, and the neuron connection structure is restricted by the use of only 4 neighboring neurons on the 2-dimensional plane to feasibly construct a regular network topology. Such a system enables us to compose an SNN with a counter-based readout circuit, which simplifies the hardware implementation of the SNN. Moreover, another technical advantage thanks to the bottom-up integration is the capability of dynamically capturing every neuron state in the network, which can significantly contribute to finding guidelines on how to enhance the performance for various computational tasks in temporal information processing. Diverse nonlinear physical dynamics needed for RC can be realized by collective behavior through dynamic interaction between neurons, like coupled oscillators, despite the simple network structure. With behavioral system-level simulations, we demonstrate physical RC through short-term memory and exclusive OR tasks, and the spoken digit recognition task with an accuracy of 97.7% as well. Our system is considerably feasible for practical applications and also can be a useful platform for studying the mechanism of physical RC.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"95 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142268199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PReLU: Yet Another Single-Layer Solution to the XOR Problem","authors":"Rafael C. Pinto, Anderson R. Tavares","doi":"arxiv-2409.10821","DOIUrl":"https://doi.org/arxiv-2409.10821","url":null,"abstract":"This paper demonstrates that a single-layer neural network using Parametric Rectified Linear Unit (PReLU) activation can solve the XOR problem, a simple fact that has been overlooked so far. We compare this solution to the multi-layer perceptron (MLP) and the Growing Cosine Unit (GCU) activation function and explain why PReLU enables this capability. Our results show that the single-layer PReLU network can achieve 100% success rate in a wider range of learning rates while using only three learnable parameters.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inferno: An Extensible Framework for Spiking Neural Networks","authors":"Marissa Dominijanni","doi":"arxiv-2409.11567","DOIUrl":"https://doi.org/arxiv-2409.11567","url":null,"abstract":"This paper introduces Inferno, a software library built on top of PyTorch that is designed to meet distinctive challenges of using spiking neural networks (SNNs) for machine learning tasks. We describe the architecture of Inferno and key differentiators that make it uniquely well-suited to these tasks. We show how Inferno supports trainable heterogeneous delays on both CPUs and GPUs, and how Inferno enables a \"write once, apply everywhere\" development methodology for novel models and techniques. We compare Inferno's performance to BindsNET, a library aimed at machine learning with SNNs, and Brian2/Brian2CUDA which is popular in neuroscience. Among several examples, we show how the design decisions made by Inferno facilitate easily implementing the new methods of Nadafian and Ganjtabesh in delay learning with spike-timing dependent plasticity.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MonoKAN: Certified Monotonic Kolmogorov-Arnold Network","authors":"Alejandro Polo-Molina, David Alfaya, Jose Portela","doi":"arxiv-2409.11078","DOIUrl":"https://doi.org/arxiv-2409.11078","url":null,"abstract":"Artificial Neural Networks (ANNs) have significantly advanced various fields by effectively recognizing patterns and solving complex problems. Despite these advancements, their interpretability remains a critical challenge, especially in applications where transparency and accountability are essential. To address this, explainable AI (XAI) has made progress in demystifying ANNs, yet interpretability alone is often insufficient. In certain applications, model predictions must align with expert-imposed requirements, sometimes exemplified by partial monotonicity constraints. While monotonic approaches are found in the literature for traditional Multi-layer Perceptrons (MLPs), they still face difficulties in achieving both interpretability and certified partial monotonicity. Recently, the Kolmogorov-Arnold Network (KAN) architecture, based on learnable activation functions parametrized as splines, has been proposed as a more interpretable alternative to MLPs. Building on this, we introduce a novel ANN architecture called MonoKAN, which is based on the KAN architecture and achieves certified partial monotonicity while enhancing interpretability. To achieve this, we employ cubic Hermite splines, which guarantee monotonicity through a set of straightforward conditions. Additionally, by using positive weights in the linear combinations of these splines, we ensure that the network preserves the monotonic relationships between input and output. Our experiments demonstrate that MonoKAN not only enhances interpretability but also improves predictive performance across the majority of benchmarks, outperforming state-of-the-art monotonic MLP approaches.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"46 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142268259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bio-Inspired Mamba: Temporal Locality and Bioplausible Learning in Selective State Space Models","authors":"Jiahao Qin","doi":"arxiv-2409.11263","DOIUrl":"https://doi.org/arxiv-2409.11263","url":null,"abstract":"This paper introduces Bio-Inspired Mamba (BIM), a novel online learning framework for selective state space models that integrates biological learning principles with the Mamba architecture. BIM combines Real-Time Recurrent Learning (RTRL) with Spike-Timing-Dependent Plasticity (STDP)-like local learning rules, addressing the challenges of temporal locality and biological plausibility in training spiking neural networks. Our approach leverages the inherent connection between backpropagation through time and STDP, offering a computationally efficient alternative that maintains the ability to capture long-range dependencies. We evaluate BIM on language modeling, speech recognition, and biomedical signal analysis tasks, demonstrating competitive performance against traditional methods while adhering to biological learning principles. Results show improved energy efficiency and potential for neuromorphic hardware implementation. BIM not only advances the field of biologically plausible machine learning but also provides insights into the mechanisms of temporal information processing in biological neural networks.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Contrastive Forward-Forward Algorithm","authors":"Xing Chen, Dongshu Liu, Jeremie Laydevant, Julie Grollier","doi":"arxiv-2409.11593","DOIUrl":"https://doi.org/arxiv-2409.11593","url":null,"abstract":"The Forward-Forward (FF) algorithm is a recent, purely forward-mode learning method, that updates weights locally and layer-wise and supports supervised as well as unsupervised learning. These features make it ideal for applications such as brain-inspired learning, low-power hardware neural networks, and distributed learning in large models. However, while FF has shown promise on written digit recognition tasks, its performance on natural images and time-series remains a challenge. A key limitation is the need to generate high-quality negative examples for contrastive learning, especially in unsupervised tasks, where versatile solutions are currently lacking. To address this, we introduce the Self-Contrastive Forward-Forward (SCFF) method, inspired by self-supervised contrastive learning. SCFF generates positive and negative examples applicable across different datasets, surpassing existing local forward algorithms for unsupervised classification accuracy on MNIST (MLP: 98.7%), CIFAR-10 (CNN: 80.75%), and STL-10 (CNN: 77.3%). Additionally, SCFF is the first to enable FF training of recurrent neural networks, opening the door to more complex tasks and continuous-time video and text processing.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the Efficacy of Instance Incremental vs. Batch Learning in Delayed Label Environments: An Empirical Study on Tabular Data Streaming for Fraud Detection","authors":"Kodjo Mawuena Amekoe, Mustapha Lebbah, Gregoire Jaffre, Hanene Azzag, Zaineb Chelly Dagdia","doi":"arxiv-2409.10111","DOIUrl":"https://doi.org/arxiv-2409.10111","url":null,"abstract":"Real-world tabular learning production scenarios typically involve evolving data streams, where data arrives continuously and its distribution may change over time. In such a setting, most studies in the literature regarding supervised learning favor the use of instance incremental algorithms due to their ability to adapt to changes in the data distribution. Another significant reason for choosing these algorithms is that they avoid storing observations in memory, as is commonly done in batch incremental settings. However, the design of instance incremental algorithms often assumes immediate availability of labels, which is an optimistic assumption. In many real-world scenarios, such as fraud detection or credit scoring, labels may be delayed. Consequently, batch incremental algorithms are widely used in many real-world tasks. This raises an important question: \"In delayed settings, is instance incremental learning the best option regarding predictive performance and computational efficiency?\" Unfortunately, this question has not been studied in depth, probably due to the scarcity of real datasets containing delayed information. In this study, we conduct a comprehensive empirical evaluation and analysis of this question using a real-world fraud detection problem and commonly used generated datasets. Our findings indicate that instance incremental learning is not the superior option, considering on one side state-of-the-art models such as Adaptive Random Forest (ARF) and on the other side batch learning models such as XGBoost. Additionally, when considering the interpretability of the learning systems, batch incremental solutions tend to be favored. Code: https://github.com/anselmeamekoe/DelayedLabelStream","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fixed-Parameter Tractability of the (1+1) Evolutionary Algorithm on Random Planted Vertex Covers","authors":"Jack Kearney, Frank Neumann, Andrew M. Sutton","doi":"arxiv-2409.10144","DOIUrl":"https://doi.org/arxiv-2409.10144","url":null,"abstract":"We present the first parameterized analysis of a standard (1+1) Evolutionary Algorithm on a distribution of vertex cover problems. We show that if the planted cover is at most logarithmic, restarting the (1+1) EA every $O(n \\log n)$ steps will find a cover at least as small as the planted cover in polynomial time for sufficiently dense random graphs $p > 0.71$. For superlogarithmic planted covers, we prove that the (1+1) EA finds a solution in fixed-parameter tractable time in expectation. We complement these theoretical investigations with a number of computational experiments that highlight the interplay between planted cover size, graph density and runtime.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kolmogorov-Arnold Transformer","authors":"Xingyi Yang, Xinchao Wang","doi":"arxiv-2409.10594","DOIUrl":"https://doi.org/arxiv-2409.10594","url":null,"abstract":"Transformers stand as the cornerstone of modern deep learning. Traditionally, these models rely on multi-layer perceptron (MLP) layers to mix the information between channels. In this paper, we introduce the Kolmogorov-Arnold Transformer (KAT), a novel architecture that replaces MLP layers with Kolmogorov-Arnold Network (KAN) layers to enhance the expressiveness and performance of the model. Integrating KANs into transformers, however, is no easy feat, especially when scaled up. Specifically, we identify three key challenges: (C1) Base function. The standard B-spline function used in KANs is not optimized for parallel computing on modern hardware, resulting in slower inference speeds. (C2) Parameter and Computation Inefficiency. KAN requires a unique function for each input-output pair, making the computation extremely large. (C3) Weight initialization. The initialization of weights in KANs is particularly challenging due to their learnable activation functions, which are critical for achieving convergence in deep neural networks. To overcome the aforementioned challenges, we propose three key solutions: (S1) Rational basis. We replace B-spline functions with rational functions to improve compatibility with modern GPUs. By implementing this in CUDA, we achieve faster computations. (S2) Group KAN. We share the activation weights through a group of neurons, to reduce the computational load without sacrificing performance. (S3) Variance-preserving initialization. We carefully initialize the activation weights to make sure that the activation variance is maintained across layers. With these designs, KAT scales effectively and readily outperforms traditional MLP-based transformers.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"105 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142268200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Steinmetz Neural Networks for Complex-Valued Data","authors":"Shyam Venkatasubramanian, Ali Pezeshki, Vahid Tarokh","doi":"arxiv-2409.10075","DOIUrl":"https://doi.org/arxiv-2409.10075","url":null,"abstract":"In this work, we introduce a new approach to processing complex-valued data using DNNs consisting of parallel real-valued subnetworks with coupled outputs. Our proposed class of architectures, referred to as Steinmetz Neural Networks, leverages multi-view learning to construct more interpretable representations within the latent space. Subsequently, we present the Analytic Neural Network, which implements a consistency penalty that encourages analytic signal representations in the Steinmetz neural network's latent space. This penalty enforces a deterministic and orthogonal relationship between the real and imaginary components. Utilizing an information-theoretic construction, we demonstrate that the upper bound on the generalization error posited by the analytic neural network is lower than that of the general class of Steinmetz neural networks. Our numerical experiments demonstrate the improved performance and robustness to additive noise, afforded by our proposed networks on benchmark datasets and synthetic examples.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}