{"title":"SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning","authors":"Amogh Joshi, Adarsh Kumar Kosta, Kaushik Roy","doi":"arxiv-2409.09990","DOIUrl":"https://doi.org/arxiv-2409.09990","url":null,"abstract":"The ability of neural networks to perform robotic perception and control tasks such as depth and optical flow estimation, simultaneous localization and mapping (SLAM), and automatic control has led to their widespread adoption in recent years. Deep Reinforcement Learning has been used extensively in these settings, as it does not have the unsustainable training costs associated with supervised learning. However, Deep RL suffers from poor sample efficiency, i.e., it requires a large number of environmental interactions to converge to an acceptable solution. Modern RL algorithms such as Deep Q-Learning and Soft Actor-Critic attempt to remedy this shortcoming but cannot provide the explainability required in applications such as autonomous robotics. Humans intuitively understand the long-time-horizon sequential tasks common in robotics. Properly using such intuition can make RL policies more explainable while enhancing their sample efficiency. In this work, we propose SHIRE, a novel framework for encoding human intuition using Probabilistic Graphical Models (PGMs) and using it in the Deep RL training pipeline to enhance sample efficiency. Our framework achieves 25-78% sample efficiency gains across the environments we evaluate at negligible overhead cost. Additionally, by teaching RL agents the encoded elementary behavior, SHIRE enhances policy explainability. A real-world demonstration further highlights the efficacy of policies trained using our framework.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"COSCO: A Sharpness-Aware Training Framework for Few-shot Multivariate Time Series Classification","authors":"Jesus Barreda, Ashley Gomez, Ruben Puga, Kaixiong Zhou, Li Zhang","doi":"arxiv-2409.09645","DOIUrl":"https://doi.org/arxiv-2409.09645","url":null,"abstract":"Multivariate time series classification is an important task with widespread domains of application. Recently, deep neural networks (DNNs) have achieved state-of-the-art performance in time series classification. However, they often require large expert-labeled training datasets, which can be infeasible in practice. In few-shot settings, i.e., when only a limited number of samples per class are available in the training data, DNNs show a significant drop in testing accuracy and poor generalization ability. In this paper, we propose to address these problems from an optimization and a loss function perspective. Specifically, we propose a new learning framework named COSCO, consisting of sharpness-aware minimization (SAM) optimization and a prototypical loss function, to improve the generalization ability of DNNs for multivariate time series classification problems under few-shot settings. Our experiments demonstrate that our proposed method outperforms the existing baseline methods. Our source code is available at: https://github.com/JRB9/COSCO.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TX-Gen: Multi-Objective Optimization for Sparse Counterfactual Explanations for Time-Series Classification","authors":"Qi Huang, Sofoklis Kitharidis, Thomas Bäck, Niki van Stein","doi":"arxiv-2409.09461","DOIUrl":"https://doi.org/arxiv-2409.09461","url":null,"abstract":"In time-series classification, understanding model decisions is crucial for their application in high-stakes domains such as healthcare and finance. Counterfactual explanations, which provide insights by presenting alternative inputs that change model predictions, offer a promising solution. However, existing methods for generating counterfactual explanations for time-series data often struggle with balancing key objectives like proximity, sparsity, and validity. In this paper, we introduce TX-Gen, a novel algorithm for generating counterfactual explanations based on the Non-dominated Sorting Genetic Algorithm II (NSGA-II). TX-Gen leverages evolutionary multi-objective optimization to find a diverse set of counterfactuals that are both sparse and valid, while maintaining minimal dissimilarity to the original time series. By incorporating a flexible reference-guided mechanism, our method improves the plausibility and interpretability of the counterfactuals without relying on predefined assumptions. Extensive experiments on benchmark datasets demonstrate that TX-Gen outperforms existing methods in generating high-quality counterfactuals, making time-series models more transparent and interpretable.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"190 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142249056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ORS: A novel Olive Ridley Survival inspired Meta-heuristic Optimization Algorithm","authors":"Niranjan Panigrahi, Sourav Kumar Bhoi, Debasis Mohapatra, Rashmi Ranjan Sahoo, Kshira Sagar Sahoo, Anil Mohapatra","doi":"arxiv-2409.09210","DOIUrl":"https://doi.org/arxiv-2409.09210","url":null,"abstract":"Meta-heuristic algorithmic development has been a thrust area of research since its inception. In this paper, a novel meta-heuristic optimization algorithm, Olive Ridley Survival (ORS), is proposed, inspired by the survival challenges faced by hatchlings of the Olive Ridley sea turtle. A major fact about Olive Ridley survival is that out of one thousand hatchlings that emerge from a nest, only one survives at sea due to various environmental and other factors. This fact acts as the backbone for developing the proposed algorithm. The algorithm has two major phases: hatchling survival through environmental factors, and the impact of movement trajectory on survival. The phases are mathematically modelled and implemented along with a suitable input representation and fitness function. The algorithm is analysed theoretically. To validate the algorithm, fourteen mathematical benchmark functions from standard CEC test suites are evaluated and statistically tested. Also, to study the efficacy of ORS on recent complex benchmark functions, ten benchmark functions of CEC-06-2019 are evaluated. Further, three well-known engineering problems are solved by ORS and compared with other state-of-the-art meta-heuristics. Simulation results show that in many cases, the proposed ORS algorithm outperforms some state-of-the-art meta-heuristic optimization algorithms. The sub-optimal behavior of ORS on some recent benchmark functions is also observed.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Training Spiking Neural Networks via Augmented Direct Feedback Alignment","authors":"Yongbo Zhang, Katsuma Inoue, Mitsumasa Nakajima, Toshikazu Hashimoto, Yasuo Kuniyoshi, Kohei Nakajima","doi":"arxiv-2409.07776","DOIUrl":"https://doi.org/arxiv-2409.07776","url":null,"abstract":"Spiking neural networks (SNNs), models inspired by the mechanisms of real neurons in the brain, transmit and represent information by employing discrete action potentials, or spikes. The sparse, asynchronous properties of their information processing make SNNs highly energy efficient, making them promising solutions for implementing neural networks in neuromorphic devices. However, the non-differentiable nature of SNN neurons makes them a challenge to train. Current SNN training methods, which are based on error backpropagation (BP) and precisely designed surrogate gradients, are difficult to implement and biologically implausible, hindering the implementation of SNNs on neuromorphic devices. Thus, it is important to train SNNs with a method that is both physically implementable and biologically plausible. In this paper, we propose using augmented direct feedback alignment (aDFA), a gradient-free approach based on random projection, to train SNNs. This method requires only partial information about the forward process during training, so it is easy to implement and biologically plausible. We systematically demonstrate the feasibility of the proposed aDFA-SNN scheme, propose its effective working range, and analyze its well-performing settings by employing a genetic algorithm. We also analyze the impact of crucial SNN features on the scheme, thus demonstrating its superiority and stability over BP and conventional direct feedback alignment. Our scheme can achieve competitive performance without accurate prior knowledge about the utilized system, thus providing a valuable reference for physically training SNNs.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying Images with CoLaNET Spiking Neural Network -- the MNIST Example","authors":"Mikhail Kiselev","doi":"arxiv-2409.07833","DOIUrl":"https://doi.org/arxiv-2409.07833","url":null,"abstract":"In the present paper, it is shown how the columnar/layered CoLaNET spiking neural network (SNN) architecture can be used in supervised learning image classification tasks. Image pixel brightness is coded by the spike count during the image presentation period. The image class label is indicated by the activity of special SNN input nodes (one node per class). The CoLaNET classification accuracy is evaluated on the MNIST benchmark. It is demonstrated that CoLaNET is almost as accurate as the most advanced machine learning algorithms (not using a convolutional approach).","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding","authors":"Ronald Katende","doi":"arxiv-2409.07310","DOIUrl":"https://doi.org/arxiv-2409.07310","url":null,"abstract":"This paper explores the integration of Diophantine equations into neural network (NN) architectures to improve model interpretability, stability, and efficiency. By encoding and decoding neural network parameters as integer solutions to Diophantine equations, we introduce a novel approach that enhances both the precision and robustness of deep learning models. Our method integrates a custom loss function that enforces Diophantine constraints during training, leading to better generalization, reduced error bounds, and enhanced resilience against adversarial attacks. We demonstrate the efficacy of this approach through several tasks, including image classification and natural language processing, where improvements in accuracy, convergence, and robustness are observed. This study offers a new perspective on combining mathematical theory and machine learning to create more interpretable and efficient models.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Y-Drop: A Conductance based Dropout for fully connected layers","authors":"Efthymios Georgiou, Georgios Paraskevopoulos, Alexandros Potamianos","doi":"arxiv-2409.09088","DOIUrl":"https://doi.org/arxiv-2409.09088","url":null,"abstract":"In this work, we introduce Y-Drop, a regularization method that biases the dropout algorithm towards dropping more important neurons with higher probability. The backbone of our approach is neuron conductance, an interpretable measure of neuron importance that calculates the contribution of each neuron towards the end-to-end mapping of the network. We investigate the impact of the uniform dropout selection criterion on performance by assigning higher dropout probability to the more important units. We show that forcing the network to solve the task at hand in the absence of its important units yields a strong regularization effect. Further analysis indicates that Y-Drop yields solutions where more neurons are important, i.e., have high conductance, and yields robust networks. In our experiments, we show that the regularization effect of Y-Drop scales better than vanilla dropout w.r.t. the architecture size and consistently yields superior performance over multiple dataset and architecture combinations, with little tuning.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"190 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced LSTM Neural Networks for Predicting Directional Changes in Sector-Specific ETFs Using Machine Learning Techniques","authors":"Rifa Gowani, Zaryab Kanjiani","doi":"arxiv-2409.05778","DOIUrl":"https://doi.org/arxiv-2409.05778","url":null,"abstract":"Trading and investing in stocks is for some a full-time career, while for others it is simply a supplementary income stream. Universal among all investors is the desire to turn a profit. The key to achieving this goal is diversification: spreading investments across sectors is critical to profitability and maximizing returns. This study aims to gauge the viability of machine learning methods in practicing the principle of diversification to maximize portfolio returns. To test this, the study evaluates the Long Short-Term Memory (LSTM) model across nine different sectors and over 2,200 stocks using Vanguard's sector-based ETFs. The R-squared value across all sectors showed promising results, with an average of 0.8651 and a high of 0.942 for the VNQ ETF. These findings suggest that the LSTM model is a capable and viable model for accurately predicting directional changes across various industry sectors, helping investors diversify and grow their portfolios.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comprehensive Comparison Between ANNs and KANs For Classifying EEG Alzheimer's Data","authors":"Akshay Sunkara, Sriram Sattiraju, Aakarshan Kumar, Zaryab Kanjiani, Himesh Anumala","doi":"arxiv-2409.05989","DOIUrl":"https://doi.org/arxiv-2409.05989","url":null,"abstract":"Alzheimer's Disease is an incurable cognitive condition that affects thousands of people globally. While some diagnostic methods exist for Alzheimer's Disease, many of these methods cannot detect Alzheimer's in its earlier stages. Recently, researchers have explored the use of Electroencephalogram (EEG) technology for diagnosing Alzheimer's. EEG is a noninvasive method of recording the brain's electrical signals, and EEG data has shown distinct differences between patients with and without Alzheimer's. In the past, Artificial Neural Networks (ANNs) have been used to predict Alzheimer's from EEG data, but these models sometimes produce false positive diagnoses. This study aims to compare losses between ANNs and Kolmogorov-Arnold Networks (KANs) across multiple epoch counts, learning rates, and node counts. The results show that across these different parameters, ANNs are more accurate in predicting Alzheimer's Disease from EEG signals.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"160 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}