Reinforcement learning cooperative congestion control for multimedia networks

2005 IEEE International Conference on Information Acquisition Pub Date : 1900-01-01 DOI:10.1109/ICIA.2005.1635085

Kao-Shing Hwang, Cheng-Shong Wu, Hui-Kai Su


Citations: 4

Abstract


Reinforcement learning cooperative congestion control for multimedia networks

This paper presents a cooperative, learning-based congestion controller for multimedia networks. The proposed controller performs rate-based predictive control and consists of two sub-systems: a long-term policy critic and a short-term rate adaptor. The controllers in a chained network jointly learn the control policy through real-time interaction, without prior knowledge of a network model. A cooperative fuzzy reward evaluator, based on game theory, supplies cooperative reinforcement signals that train the controllers to adapt to a dynamic network environment. Once trained, the controllers adaptively regulate the source flow to meet three requirements simultaneously: high link utilization, low packet loss rate (PLR), and low end-to-end delay. Simulation results show that the approach is effective at controlling congestion of multimedia traffic on the Internet.
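The abstract names three pieces (a long-term critic, a short-term rate adaptor, and a cooperative fuzzy reward) but gives no equations, so the following is only a minimal sketch of how such pieces could fit together. Everything in it is an assumption made for illustration: the membership breakpoints, the min-based fuzzy AND, the weighted blend used as the "cooperative" signal (the paper's game-theoretic formulation is not reproduced here), the single-state TD(0) critic, and all names such as `RateController` and `fuzzy_reward`.

```python
def ramp_up(x, lo, hi):
    """Membership that rises linearly from 0 at `lo` to 1 at `hi`."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def fuzzy_reward(utilization, plr, delay_ms):
    """Score in [0, 1] for 'high utilization AND low loss AND low delay'.

    Breakpoints are illustrative guesses, not values from the paper.
    The minimum is used as the fuzzy AND.
    """
    good_util = ramp_up(utilization, 0.5, 0.9)
    low_loss = 1.0 - ramp_up(plr, 0.0, 0.05)
    low_delay = 1.0 - ramp_up(delay_ms, 20.0, 200.0)
    return min(good_util, low_loss, low_delay)


def cooperative_reward(local, neighbours, weight=0.5):
    """Blend a controller's own reward with the mean of its neighbours'.

    A plain weighted average stands in for the paper's cooperative signal.
    """
    if not neighbours:
        return local
    return (1.0 - weight) * local + weight * sum(neighbours) / len(neighbours)


def clamp(x, lo=0.0, hi=1.0):
    return max(lo, min(hi, x))


class RateController:
    """Hypothetical single-state actor-critic rate controller."""

    def __init__(self, alpha=0.1, beta=0.05, gamma=0.9):
        self.value = 0.0  # critic: long-term estimate of discounted reward
        self.rate = 0.5   # actor: preferred normalized source rate in [0, 1]
        self.alpha = alpha  # critic learning rate
        self.beta = beta    # actor learning rate
        self.gamma = gamma  # discount factor

    def act(self, rng, sigma=0.1):
        """Short-term rate adaptation: explore around the preferred rate."""
        return clamp(self.rate + rng.gauss(0.0, sigma))

    def learn(self, action, reward):
        """TD(0) update of the critic, then move the actor toward the
        tried action if it did better than expected (away if worse)."""
        td_error = reward + self.gamma * self.value - self.value
        self.value += self.alpha * td_error
        self.rate = clamp(self.rate + self.beta * td_error * (action - self.rate))
        return td_error
```

In a chained network, each node would call `act` with a `random.Random` instance, measure its link's utilization, PLR, and delay, compute `fuzzy_reward`, exchange that score with adjacent nodes, and pass the result of `cooperative_reward` to `learn`.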
