{"title":"Even simple neural nets cannot be trained reliably with a polynomial number of examples","authors":"H. Shvaytser","doi":"10.1109/IJCNN.1989.118691","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118691","url":null,"abstract":"A variation of L.G. Valiant's 'PAC' model of learnability (Commun. ACM, vol.27, no.11, p.1134-42, 1984; Proc. 9th Int. Joint Conf. Artif. Intell., Aug. 1985) is used to investigate the learning power of artificial neural nets with threshold nodes. It is shown that there are cases where simple nets require an exponential number of training examples for reliably determining their sets of parameters. Polynomially many training examples may not be enough to determine the set of parameters even for a net of three threshold nodes, if it has to perform reliably in two different environments.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131824413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistent inference of probabilities in layered networks: predictions and generalizations","authors":"Naftali Tishby, E. Levin, S. Solla","doi":"10.1109/IJCNN.1989.118274","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118274","url":null,"abstract":"The problem of learning a general input-output relation using a layered neural network is discussed in a statistical framework. By imposing the consistency condition that the error minimization be equivalent to a likelihood maximization for training the network, the authors arrive at a Gibbs distribution on a canonical ensemble of networks with the same architecture. This statistical description enables them to evaluate the probability of a correct prediction of an independent example, after training the network on a given training set. The prediction probability is highly correlated with the generalization ability of the network, as measured outside the training set. This suggests a general and practical criterion for training layered networks by minimizing prediction errors. The authors demonstrate the utility of this criterion for selecting the optimal architecture in the continuity problem. As a theoretical application of the statistical formalism, they discuss the question of learning curves and estimate the sufficient training size needed for correct generalization, in a simple example.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122323027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Universal approximation using feedforward networks with non-sigmoid hidden layer activation functions","authors":"M. Stinchcombe, H. White","doi":"10.1109/IJCNN.1989.118640","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118640","url":null,"abstract":"K.M. Hornik, M. Stinchcombe, and H. White (Univ. of California at San Diego, Dept. of Economics Discussion Paper, June 1988; to appear in Neural Networks) showed that multilayer feedforward networks with as few as one hidden layer, no squashing at the output layer, and arbitrary sigmoid activation function at the hidden layer are universal approximators: they are capable of arbitrarily accurate approximation to arbitrary mappings, provided sufficiently many hidden units are available. The present authors obtain identical conclusions but do not require the hidden-unit activation to be sigmoid. Instead, it can be a rather general nonlinear function. Thus, multilayer feedforward networks possess universal approximation capabilities by virtue of the presence of intermediate layers with sufficiently many parallel processors; the properties of the intermediate-layer activation function are not so crucial. In particular, sigmoid activation functions are not necessary for universal approximation.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121406716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance of back propagation networks for associative database retrieval","authors":"V. Cherkassky, N. Vassilas","doi":"10.1109/IJCNN.1989.118562","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118562","url":null,"abstract":"Back-propagation networks have been successfully used to perform a variety of input-output mapping tasks for recognition, generalization, and classification. In spite of this method's popularity, virtually nothing is known about its saturation/capacity and, in more general terms, about its performance as an associative memory. The authors address these issues using associative database retrieval as an original application domain. Experimental results show that the quality of recall and the network capacity are very significantly affected by the network topology (the number of hidden units), data representation (encoding), and the choice of learning parameters. On the basis of their results and the fact that back-propagation learning is not recursive, the authors conclude that back-propagation networks can be used mainly as read-only associative memories and represent a poor choice for read-and-write associative memories.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126094957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hierarchical neural network involving nonlinear spectral processing","authors":"O. Ersoy, D. Hong","doi":"10.1109/IJCNN.1989.118514","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118514","url":null,"abstract":"Summary form only given, as follows. A new neural network architecture called the hierarchical neural network (HNN) is introduced. The HNN involves a number of stages in which each stage can be a particular neural network (SNN). Between two SNNs there is a nonlinear transformation of those input vectors rejected by the first SNN. The HNN has many desirable properties such as optimized system complexity in the sense of minimized number of stages, high classification accuracy, minimized learning and recall times, and truly parallel architectures in which all SNNs are operating simultaneously without waiting for data from each other. The experiments performed in comparison to multilayered networks with backpropagation training indicated the superiority of the HNN.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"176 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123570702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supervised learning with artificial selection","authors":"M. Hagiwara, M. Nakagawa","doi":"10.1109/IJCNN.1989.118443","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118443","url":null,"abstract":"Summary form only given, as follows. Supervised learning with artificial selection is proposed as a way to escape from local minima. The concept of artificial selection is reasonable for nature. In the authors' method, the 'worst' hidden unit is detected, and then all the weights connected to the detected hidden unit are reset to small random values. According to simulations, only half the trials using conventional backpropagation converge, whereas all of the trials using the proposed method converge, and quickly do so.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132500594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a hardware realisable model of the neuron","authors":"D. Gorse, J. Taylor","doi":"10.1109/IJCNN.1989.118407","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118407","url":null,"abstract":"Summary form only given, as follows. A model of neural processing is proposed which is able to incorporate a great deal of neurophysiological detail, including effects associated with the mechanism of postsynaptic summation, cell surface geometry, and axo-axonal interactions and is capable of hardware realization as a probabilistic random access memory (pRAM). The model is an extension of earlier work by the authors, which by operating at much shorter time scales (on the order of the lifetime of a quantum of neurotransmitter in the synaptic cleft) allows a greater amount of information to be retrieved from the simulated spike train. The mathematical framework for the model appears to be that of an extended Markov process (involving the firing histories of the N neurons). Simulations of single units have yielded results in excellent agreement with theoretical predictions. The extended neural model is expected to be particularly applicable in situations where timing constraints are of special importance (such as the auditory cortex) or where firing thresholds are high, as is the case for the granule and pyramidal cells of the hippocampus.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124092893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using back-propagation networks to assess several image representation schemes for object recognition","authors":"J. Lubin, K. Jones, A. Kornhauser","doi":"10.1109/IJCNN.1989.118464","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118464","url":null,"abstract":"Summary form only given, as follows. Two chapters of research are presented. The first constitutes a demonstration that backpropagation networks can be used as a content addressable memory for visual objects represented within digitized real-world images. For networks encoding two or three classes of traffic signs, classification generalization is demonstrated for objects at new positions on the image frame and also for new instances of a trained class of object. The new instance may even be a somewhat degraded representation. Given this optimistic introduction, the work evolves into a second, more comparative chapter. In this further probe, packpropagation networks are used as content addressable memories with which to determine the relative value of several different visual object representation schemes. These representation schemes are tested along multiple parameters to deduce the efficacy of the scheme itself, and the influence of network parameter changes on the learning and categorization of objects.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116018574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EEG spike detection using backpropagation networks","authors":"R. Eberhart, R. W. Dobbins, W. Webber","doi":"10.1109/IJCNN.1989.118551","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118551","url":null,"abstract":"Summary form only given, as follows. The design of a system to analyze electroencephalogram (EEG) signals for the detection of epileptiform spikes is described. The ultimate goal is real-time multichannel spike detection. Two main areas of development are reviewed. The first is the processing and characterization of the raw EEG data, including issues related to data rates, the number of data channels, and the tradeoffs between the amount of data preprocessing and the complexities of the neural net work required. The second is the selection and implementation of the neural net work architecture, including choices between supervised and unsupervised learning schemes, and among the many available learning algorithms for each network architecture. Interim results involving the analysis of single-channel EEG data are discussed. The relationship of the spike detection project to a similar effort in seizure detection is described.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126527133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The mathematical theory of learning algorithms for Boltzmann machines","authors":"H. Sussmann","doi":"10.1109/IJCNN.1989.118278","DOIUrl":"https://doi.org/10.1109/IJCNN.1989.118278","url":null,"abstract":"The author analyzes a version of a well-known learning algorithm for Boltzmann machines, based on the usual alternation between learning and hallucinating phases. He outlines the rigorous proof that, for suitable choices of the parameters, the evolution of the weights follows very closely, with very high probability, an integral trajectory of the gradient of the likelihood function whose global maxima are exactly the desired weight patterns.<<ETX>>","PeriodicalId":199877,"journal":{"name":"International 1989 Joint Conference on Neural Networks","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125748684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}