Zhiping Peng, Delong Cui, Yuanjia Ma, Jianbin Xiong, Bo Xu, Weiwei Lin
{"title":"R-Learning and Gaussian Process Regression Algorithm for Cloud Job Access Control","authors":"Zhiping Peng, Delong Cui, Yuanjia Ma, Jianbin Xiong, Bo Xu, Weiwei Lin","doi":"10.1109/CSCloud.2016.15","DOIUrl":null,"url":null,"abstract":"Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Recently reinforcement learning has been given abroad attention, but when it is applied to solve problems with large-scale discrete or contiguous state space environments, the results are likely to be unsatisfactory and even fail to find optimal policies. In order to solve this problem, we establish a new generative model about the value function and use Gaussian Process Regression to approximate the state-action pairs which were never or seldom visited. We testify to the performance of the proposed algorithm by an access-control queuing job in a cloud computing environment. The computational results demonstrate the scheme can balance the exploration and exploitation in the learning process and accelerate the convergence to a certain extent.","PeriodicalId":410477,"journal":{"name":"2016 IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSCloud.2016.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Recently reinforcement learning has been given abroad attention, but when it is applied to solve problems with large-scale discrete or contiguous state space environments, the results are likely to be unsatisfactory and even fail to find optimal policies. In order to solve this problem, we establish a new generative model about the value function and use Gaussian Process Regression to approximate the state-action pairs which were never or seldom visited. We testify to the performance of the proposed algorithm by an access-control queuing job in a cloud computing environment. The computational results demonstrate the scheme can balance the exploration and exploitation in the learning process and accelerate the convergence to a certain extent.