A Multi-agent Reinforcement Learning Model for Service Composition
Authors: Hongbing Wang, Xiaojun Wang, Xuan Zhou
Published in: 2012 IEEE Ninth International Conference on Services Computing, 2012-06-24
DOI: 10.1109/SCC.2012.58 (https://doi.org/10.1109/SCC.2012.58)
Citations: 2
Abstract
This paper describes a multi-agent reinforcement learning model for optimizing Web service composition. Based on the model, we propose a multi-agent Q-learning algorithm in which each agent benefits from the advice of the other agents in the team. Compared with single-agent reinforcement learning, our algorithm speeds up convergence to an optimal policy. In addition, it allows a composite service to adjust itself dynamically to a varying environment, where the properties of the component services keep changing. Our experiments demonstrate the efficiency of our algorithm.
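
The abstract does not say how agents exchange advice, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the simplest possible scheme: every agent reads from and writes to one shared Q-table, so each agent's experience is immediately available to the others. The workflow (three tasks, two candidate services each), the reward values in `REWARDS`, and the names `train` and `greedy_policy` are all hypothetical.

```python
import random

# Hypothetical QoS rewards, REWARDS[task][service]; the values are made up.
REWARDS = [[0.2, 0.9], [0.7, 0.1], [0.4, 0.8]]


def train(num_agents=3, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning in which all agents update one shared Q-table."""
    rng = random.Random(seed)
    n_tasks = len(REWARDS)
    # Assumption: "advice" is modelled as a Q-table shared by every agent.
    q = [[0.0] * len(services) for services in REWARDS]
    for _ in range(episodes):
        for _agent in range(num_agents):
            # Each agent walks the workflow once, picking a service per task.
            for task in range(n_tasks):
                n_actions = len(REWARDS[task])
                if rng.random() < epsilon:
                    # Explore: try a random candidate service.
                    action = rng.randrange(n_actions)
                else:
                    # Exploit: pick the service with the highest Q-value so far.
                    action = max(range(n_actions), key=lambda a: q[task][a])
                reward = REWARDS[task][action]
                # Best value of the next task; 0 once the workflow is finished.
                next_best = max(q[task + 1]) if task + 1 < n_tasks else 0.0
                # Standard Q-learning update.
                q[task][action] += alpha * (
                    reward + gamma * next_best - q[task][action]
                )
    return q


def greedy_policy(q):
    """Return the index of the highest-valued service for each task."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With a shared table, adding agents simply multiplies the experience gathered per episode, which is one plausible reading of why cooperation could speed up convergence. A dynamic environment could be imitated by changing `REWARDS` between episodes, since the constant learning rate `alpha` lets the Q-values track the new rewards.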