Title: Nonlinear reinforcement schemes for learning automata
Authors: H. E. Garcia, Abhik Ray
Venue: 29th IEEE Conference on Decision and Control
Publication date: 1990-12-05
DOI: https://doi.org/10.1109/CDC.1990.204017
Citations: 9
Abstract
This paper presents the development and evaluation of two novel nonlinear reinforcement schemes for learning automata. The schemes are designed to adapt faster than the existing L_{R-P} (linear reward-penalty) schemes when interacting with nonstationary environments. The first is a nonlinear scheme incorporating history (NSIH); the second is a nonlinear scheme with unstable zones (NSWUZ). The prime objective of these algorithms is to reduce the number of iterations needed for the action probability vector to reach a desired level of accuracy, rather than to converge to a specific unit vector of the Cartesian coordinate system. Simulation experiments were conducted to assess the learning properties of NSIH and NSWUZ in nonstationary environments. The results show that the proposed nonlinear algorithms respond to environmental changes faster than the L_{R-P} scheme.
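For orientation, the baseline that the abstract compares against can be sketched in a few lines. The code below is a minimal sketch of the standard textbook L_{R-P} update with equal reward and penalty step sizes, run in an environment whose penalty probabilities switch between phases. It is not the paper's NSIH or NSWUZ scheme, which the abstract does not specify. The function names `lrp_update` and `simulate`, the step size `a`, and the phase layout are all assumptions made for illustration.

```python
import random


def lrp_update(p, chosen, rewarded, a=0.1, b=0.1):
    """One linear reward-penalty step on action probability vector p.

    Textbook form (assumed here, not taken from the paper):
    on reward, move probability mass toward the chosen action;
    on penalty, move mass away from it and spread it over the others.
    L_{R-P} uses equal step sizes, a == b.
    """
    r = len(p)
    if rewarded:
        return [pj + a * (1.0 - pj) if j == chosen else (1.0 - a) * pj
                for j, pj in enumerate(p)]
    return [(1.0 - b) * pj if j == chosen else b / (r - 1) + (1.0 - b) * pj
            for j, pj in enumerate(p)]


def simulate(penalty_probs_by_phase, steps_per_phase, a=0.1, seed=0):
    """Run the automaton through a sequence of environment phases.

    Each phase is a list of per-action penalty probabilities; switching
    phases models a nonstationary environment (illustrative setup only).
    Returns the probability vector after every step.
    """
    rng = random.Random(seed)
    r = len(penalty_probs_by_phase[0])
    p = [1.0 / r] * r  # start from the uniform distribution
    history = []
    for penalty_probs in penalty_probs_by_phase:
        for _ in range(steps_per_phase):
            chosen = rng.choices(range(r), weights=p)[0]
            rewarded = rng.random() >= penalty_probs[chosen]
            p = lrp_update(p, chosen, rewarded, a, a)
            history.append(p)
    return history


if __name__ == "__main__":
    # Action 0 is best in the first phase, action 1 in the second.
    phases = [[0.1, 0.8], [0.8, 0.1]]
    trace = simulate(phases, steps_per_phase=500)
    print("end of phase 1:", trace[499])
    print("end of phase 2:", trace[-1])
```

Counting how many steps after the switch the probability of the newly best action needs to cross a chosen threshold gives one rough measure of adaptation rate, which is the quantity the abstract says the nonlinear schemes aim to improve.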