Biologically inspired incremental learning for high-dimensional spaces
A. Gepperth, Thomas Hecht, Mathieu Lefort, Ursula Körner
DOI: 10.1109/DEVLRN.2015.7346155
2015 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), 7 December 2015
Citations: 9
Abstract
We propose an incremental, highly parallelizable, constant-time-complexity neural learning architecture for multi-class classification (and regression) problems that remains resource-efficient even when the number of input dimensions is very high (≥ 1000). This so-called projection-prediction (PRO-PRE) architecture is strongly inspired by biological information processing in that it uses a prototype-based, topologically organized hidden layer whose weights are updated whenever an error occurs. The employed self-organizing map (SOM) learning adapts only the weights of localized neural sub-populations that are similar to the input, which explicitly avoids the catastrophic forgetting exhibited by MLPs when new input statistics are presented. The readout layer applies linear regression to hidden-layer activities passed through a transfer function, making the whole system capable of representing strongly non-linear decision boundaries. The resource-efficiency of the algorithm stems from approximating similarity in the input space by proximity in the SOM layer, exploiting the topological SOM projection property. This avoids storing inter-cluster distances (quadratic in the number of hidden-layer elements) or input-space covariance matrices (quadratic in the input dimension), as other incremental algorithms typically do. Tests on the popular MNIST handwritten digit benchmark show that the algorithm compares favorably to state-of-the-art results, and parallelizability is demonstrated by analyzing the efficiency of a parallel GPU implementation of the architecture.
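To make the two-stage structure described in the abstract concrete, the following is a minimal sketch of a PRO-PRE style learner in Python: a SOM hidden layer whose localized, error-gated updates stand in for the projection step, followed by a linear readout trained on transfer-function-mapped SOM activities. The class name, the Gaussian transfer function, the misclassification gating, and all learning-rate/neighborhood parameters are assumptions chosen for illustration; they are not taken from the paper.

```python
# Illustrative sketch only: a SOM hidden layer plus a linear readout, loosely
# following the PRO-PRE description in the abstract. Names and parameter
# choices (grid size, Gaussian transfer function, learning rates) are assumed.
import numpy as np

class ProPreClassifier:
    def __init__(self, input_dim, n_classes, grid=(10, 10),
                 som_lr=0.1, sigma=1.5, readout_lr=0.01):
        n_units = grid[0] * grid[1]
        self.W = np.random.randn(n_units, input_dim) * 0.01   # SOM prototypes
        self.R = np.zeros((n_classes, n_units))                # linear readout weights
        self.som_lr, self.sigma, self.readout_lr = som_lr, sigma, readout_lr
        # grid coordinates of each unit, used by the neighborhood function
        gx, gy = np.meshgrid(np.arange(grid[0]), np.arange(grid[1]), indexing="ij")
        self.coords = np.stack([gx.ravel(), gy.ravel()], axis=1).astype(float)

    def _activities(self, x):
        # squared distance of the input to every prototype
        d2 = np.sum((self.W - x) ** 2, axis=1)
        # assumed transfer function: Gaussian of the prototype distance
        return np.exp(-d2 / (2.0 * np.median(d2) + 1e-9))

    def partial_fit(self, x, y_onehot):
        """One incremental update on a single (input, one-hot label) pair."""
        a = self._activities(x)
        if np.argmax(self.R @ a) != np.argmax(y_onehot):
            # the abstract says hidden-layer weights are updated when an error
            # occurs; here the SOM step is simply gated on misclassification.
            # Only units near the best-matching unit move, which is what limits
            # interference with previously learned inputs.
            bmu = np.argmin(np.sum((self.W - x) ** 2, axis=1))
            dist2 = np.sum((self.coords - self.coords[bmu]) ** 2, axis=1)
            h = np.exp(-dist2 / (2.0 * self.sigma ** 2))
            self.W += self.som_lr * h[:, None] * (x - self.W)
            a = self._activities(x)
        # linear readout trained by stochastic gradient descent on squared error
        err = y_onehot - self.R @ a
        self.R += self.readout_lr * np.outer(err, a)

    def predict(self, x):
        return int(np.argmax(self.R @ self._activities(x)))
```

Because every update touches only the SOM units near the best-matching prototype plus a fixed-size readout matrix, the per-sample cost stays constant in the number of training samples seen, which is the incremental, resource-efficient behavior the abstract emphasizes.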