{"title":"Reinforcement learning-based dynamic bandwidth provisioning for quality of service in differentiated services networks","authors":"C. Tham, T. Hui","doi":"10.1109/ICON.2003.1266241","DOIUrl":null,"url":null,"abstract":"The issue of bandwidth provisioning for Per Hop Behavior (PHB) aggregates in Differentiated Services (DiffServ) networks is imperative for differentiated QoS to be achieved. This paper proposes an adaptive provisioning scheme that determines at regular intervals the amount of bandwidth to provision for each PHB aggregate, based on traffic conditions and feedback received about the extent to which QoS is being met. The scheme adjusts parameters to minimize a penalty function that is based on the QoS requirements agreed upon in the service level agreement (SLA). The novel use of a continuous-space, gradient-descent reinforcement learning algorithm enables the scheme to work effectively without accurate traffic characterization or any assumption about the network model. Using ns-2 simulations, we show that the algorithm is able to converge to a policy that provisions bandwidth such that QoS requirements are satisfied.","PeriodicalId":122389,"journal":{"name":"The 11th IEEE International Conference on Networks, 2003. ICON2003.","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The 11th IEEE International Conference on Networks, 2003. ICON2003.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICON.2003.1266241","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Reinforcement learning-based dynamic bandwidth provisioning for quality of service in differentiated services networks
Bandwidth provisioning for Per Hop Behavior (PHB) aggregates is essential to achieving differentiated QoS in Differentiated Services (DiffServ) networks. This paper proposes an adaptive provisioning scheme that determines, at regular intervals, how much bandwidth to provision for each PHB aggregate, based on traffic conditions and on feedback about the extent to which QoS requirements are being met. The scheme adjusts its parameters to minimize a penalty function derived from the QoS requirements agreed upon in the service level agreement (SLA). The novel use of a continuous-space, gradient-descent reinforcement learning algorithm enables the scheme to work effectively without accurate traffic characterization or any assumption about the network model. Using ns-2 simulations, we show that the algorithm converges to a policy that provisions bandwidth such that QoS requirements are satisfied.
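The abstract does not specify the state, the action, the penalty function, or the exact learning rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes two PHB aggregates sharing one normalized link, a hypothetical weighted-shortfall penalty standing in for the SLA-based penalty, a synthetic uniform traffic model in place of ns-2, and a REINFORCE-style Gaussian policy as one possible continuous-space, gradient-descent learner. All names and constants (`WEIGHTS`, `SIGMA`, `ALPHA`, `penalty`, `train`) are illustrative.

```python
import random

CAPACITY = 1.0         # normalized link capacity (assumption)
WEIGHTS = (10.0, 1.0)  # hypothetical SLA penalty weights: premium, best-effort
SIGMA = 0.05           # fixed exploration noise of the Gaussian policy
ALPHA = 2e-4           # learning rate


def penalty(share, demand):
    """Weighted bandwidth shortfall over the two aggregates (illustrative)."""
    alloc = (share * CAPACITY, (1.0 - share) * CAPACITY)
    return sum(w * max(0.0, d - a) for w, d, a in zip(WEIGHTS, demand, alloc))


def train(steps=3000, seed=0):
    """Adjust the policy parameters once per provisioning interval."""
    rng = random.Random(seed)
    w, b = 0.0, 0.5    # policy mean: mu = w * load + b (share for aggregate 0)
    baseline = 0.0     # running average of the penalty, reduces variance
    history = []
    for _ in range(steps):
        # Synthetic offered load per aggregate for this interval (assumption).
        demand = (rng.uniform(0.6, 0.8), rng.uniform(0.1, 0.2))
        load = demand[0]                  # observed traffic condition
        mu = w * load + b
        raw = rng.gauss(mu, SIGMA)        # explore around the policy mean
        share = min(1.0, max(0.0, raw))   # a valid fraction of the link
        p = penalty(share, demand)        # feedback on how well QoS was met

        # REINFORCE step: descend the gradient of the expected penalty.
        score = (raw - mu) / SIGMA ** 2   # d log pi / d mu
        advantage = p - baseline
        w -= ALPHA * advantage * score * load
        b -= ALPHA * advantage * score
        baseline += 0.05 * (p - baseline)
        history.append(p)
    return (w, b), history


if __name__ == "__main__":
    (w, b), history = train()
    print("mean penalty, first 50 intervals:", sum(history[:50]) / 50)
    print("mean penalty, last 500 intervals:", sum(history[-500:]) / 500)
    print("learned parameters: w=%.3f b=%.3f" % (w, b))
```

In this toy setting the initial policy gives half the link to each aggregate, which starves the heavily weighted one; the update gradually shifts bandwidth toward it until the penalty is small. The learner never sees the traffic distribution, only the per-interval penalty, which loosely mirrors the model-free property the abstract claims.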