Reinforcement learning algorithms as function optimizers
Ronald J. Williams
International 1989 Joint Conference on Neural Networks
Published: 1989-12-01
DOI: 10.1109/IJCNN.1989.118683 (https://doi.org/10.1109/IJCNN.1989.118683)
Citations: 20
Abstract
Any nonassociative reinforcement learning algorithm can be viewed as a method for performing function optimization through (possibly noise-corrupted) sampling of function values. A description is given of the results of simulations in which the optima of several deterministic functions studied by D.H. Ackley (Ph.D. Diss., Carnegie-Mellon Univ., 1987) were sought using variants of REINFORCE algorithms. Results obtained for certain of these algorithms compare favorably to the best results found by Ackley.
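To make the first sentence of the abstract concrete, the following is a minimal sketch of a REINFORCE-style update used as a function optimizer over bit vectors. It is not a reproduction of the variants simulated in the paper. Everything named here is an assumption for illustration: the `one_max` objective (a simple bit-counting function), the Bernoulli-logistic parameterization with one logit per bit, the moving-average baseline, and the values of `alpha`, `decay`, `steps`, and `w_max`.

```python
import math
import random


def one_max(bits):
    """Illustrative objective (an assumption): count of 1 bits, maximized by all ones."""
    return sum(bits)


def reinforce_optimize(f, n_bits, steps=2000, alpha=0.1, decay=0.9,
                       w_max=30.0, seed=0):
    """Search for a maximizer of f over {0,1}^n_bits using only sampled values of f.

    Each bit is drawn from an independent Bernoulli unit with p_i = sigmoid(w_i).
    After each sample y with reward r = f(y), the logits are updated by
        w_i += alpha * (r - baseline) * (y_i - p_i),
    where (y_i - p_i) is the derivative of log Pr(y_i) with respect to w_i.
    """
    rng = random.Random(seed)
    w = [0.0] * n_bits            # logits; every p_i starts at 0.5
    baseline = None               # running average of past rewards
    best_bits, best_val = None, -math.inf

    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-wi)) for wi in w]
        y = [1 if rng.random() < pi else 0 for pi in p]
        r = f(y)                  # the only information used about f

        if r > best_val:
            best_bits, best_val = y, r

        if baseline is None:
            baseline = r          # first sample only initializes the baseline

        for i in range(n_bits):
            w[i] += alpha * (r - baseline) * (y[i] - p[i])
            # Clip logits so math.exp cannot overflow (a safeguard, not part of the rule).
            w[i] = max(-w_max, min(w_max, w[i]))

        baseline = decay * baseline + (1.0 - decay) * r

    return best_bits, best_val, w


if __name__ == "__main__":
    bits, val, logits = reinforce_optimize(one_max, n_bits=10)
    print("best sample:", bits, "value:", val)
```

The optimizer never inspects the structure of `f`; it only observes sampled function values, which is the sense in which a nonassociative reinforcement learner acts as a function optimizer. Swapping `one_max` for another function of a bit list requires no other change.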