{
  "title": "Genetic reinforcement learning for cooperative traffic signal control",
  "authors": "S. Mikami, Y. Kakazu",
  "doi": "10.1109/ICEC.1994.350012",
  "journal": {"name": "Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence"},
  "publicationDate": "1994-06-27",
  "publicationTypes": "Journal Article",
  "isOpenAccess": false,
  "citationCount": "96",
  "platform": "Semanticscholar",
  "ListUrlMain": "https://doi.org/10.1109/ICEC.1994.350012"
}
Genetic reinforcement learning for cooperative traffic signal control
Optimization of a group of traffic signals over an area is a large, multi-agent, real-time planning problem for which no precise reference model is given. To do this planning, each signal should learn not only to acquire its control plans individually through reinforcement learning, but also to cooperate with other signals. These two objectives (distributed learning of agents and cooperation among agents) conflict with each other, so a method that blends them is required. In the method proposed in this paper, the two objectives correspond to localized reinforcement learning and global combinatorial optimization, respectively, and the method thus achieves cooperation in the long term without interfering with the agents' autonomy. The outline of the idea is as follows: each agent performs reinforcement learning and reports its cumulative performance evaluation, and combinatorial optimization is simultaneously carried out to find appropriate parameters for long-term learning that maximize the total profit of the signals (agents).
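
The outline above can be illustrated with a small toy sketch. It is not the authors' implementation: the abstract does not specify the reward, the state space, the learned parameters, or the genetic operators, so everything here is an assumption made for illustration. In particular, the two-action learner in `run_agent`, the ring-shaped neighbour structure, the coordination reward, and the names `total_profit` and `evolve` are hypothetical. For simplicity the sketch runs each learning episode inside the fitness evaluation, whereas the abstract describes the two processes as running simultaneously.

```python
import random

N_AGENTS = 4      # hypothetical number of signals on a ring of intersections
ACTIONS = (0, 1)  # toy action set: 0 = keep phase, 1 = switch phase


def run_agent(param, neighbor_param, rng, steps=200):
    """Local reinforcement learning for one signal (toy stand-in).

    `param` is the long-term parameter assigned by the global search
    (assumed here to be a value in [0, 1]). The agent learns action values
    with an epsilon-greedy rule and returns its cumulative reward, which is
    the "performance evaluation" it reports.
    """
    q = [0.0, 0.0]
    total = 0.0
    coordination = 1.0 - abs(param - neighbor_param)  # assumed reward shape
    for _ in range(steps):
        if rng.random() < 0.1:                        # explore
            action = rng.choice(ACTIONS)
        else:                                         # exploit
            action = max(ACTIONS, key=lambda a: q[a])
        reward = (coordination if action == 1 else 0.3) + rng.gauss(0, 0.05)
        q[action] += 0.1 * (reward - q[action])       # incremental value update
        total += reward
    return total


def total_profit(genome, seed=0):
    """Sum of the cumulative rewards reported by all agents for one genome."""
    rng = random.Random(seed)  # fixed seed keeps the fitness deterministic
    n = len(genome)
    return sum(run_agent(genome[i], genome[(i + 1) % n], rng) for i in range(n))


def evolve(generations=20, pop_size=12, seed=0):
    """Global combinatorial search over per-agent parameters (simple GA)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(N_AGENTS)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda g: total_profit(g, seed), reverse=True)
        parents = ranked[: pop_size // 2]             # elitist truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, N_AGENTS)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            i = rng.randrange(N_AGENTS)               # mutate one gene, clamp to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda g: total_profit(g, seed))
    return best, total_profit(best, seed)


if __name__ == "__main__":
    best, score = evolve()
    print("best parameters:", [round(x, 2) for x in best])
    print("total profit:", round(score, 1))
```

In this toy setting each agent stays autonomous in how it picks actions, while the outer search only adjusts the parameters under which the agents learn, which mirrors the division of labour described in the abstract.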