Authors: K. Vamvoudakis, Aris Kanellopoulos
Venue: 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Published: 2019-09-01
DOI: https://doi.org/10.1109/ALLERTON.2019.8919756
Non-Equilibrium Learning and Cyber-Physical Security
This paper introduces a framework for non-equilibrium behavior analysis in cyber-physical systems for security purposes. To categorize the player, we employ the principles of reinforcement learning to derive an iterative method of optimal responses that determines the policy of an agent with level-$k$ intelligence in a general non-zero-sum, nonlinear environment. For the special case of zero-sum, linear-quadratic games, we derive appropriate non-equilibrium game Riccati equations. To obviate the need for complete knowledge of the system dynamics, we employ a Q-learning algorithm as a best-response solver. We then design an estimator that determines the distribution of intelligence levels in the adversarial environment of the system. Finally, simulation results showcase the efficacy of our approach.
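The level-$k$ idea in the zero-sum, linear-quadratic case can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a scalar discrete-time system x+ = a*x + b*u + d*w, stage cost q*x^2 + r*u^2 - gamma^2*w^2, a level-0 adversary that does nothing (w = 0), and the convention that the level-k defender best-responds to the level-(k-1) adversary while the level-k adversary best-responds to the level-k defender. Each best response is found by fixed-point iteration on a scalar Riccati-type equation, which assumes the model is known; the paper's Q-learning solver is meant to remove that requirement and is not reproduced here.

```python
"""Level-k best-response iteration in a scalar, discrete-time, zero-sum LQ game.

Illustrative assumptions (not from the paper):
  dynamics    x+ = a*x + b*u + d*w
  stage cost  q*x^2 + r*u^2 - gamma^2*w^2
  defender    u = -K*x (minimizes), adversary w = L*x (maximizes)
"""


def defender_best_response(a, b, d, q, r, gamma, L, iters=500):
    """Gain K that minimizes the cost against a fixed adversary gain L."""
    a_eff = a + d * L                  # dynamics seen by the defender
    q_eff = q - gamma**2 * L**2        # state weight seen by the defender
    P = 0.0
    for _ in range(iters):             # scalar Riccati value iteration
        P = q_eff + a_eff**2 * P - (a_eff * b * P) ** 2 / (r + b**2 * P)
    return a_eff * b * P / (r + b**2 * P)


def adversary_best_response(a, b, d, q, r, gamma, K, iters=500):
    """Gain L that maximizes the cost against a fixed defender gain K."""
    a_cl = a - b * K                   # dynamics seen by the adversary
    q_cl = q + r * K**2                # state weight seen by the adversary
    S = 0.0
    for _ in range(iters):
        denom = gamma**2 - d**2 * S
        if denom <= 0:
            raise ValueError("gamma is too small: adversary value is unbounded")
        S = q_cl + a_cl**2 * S * gamma**2 / denom
    return d * a_cl * S / (gamma**2 - d**2 * S)


def level_k_gains(levels, a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=5.0):
    """Return [(k, K_k, L_k)] for k = 1..levels, starting from L_0 = 0."""
    L = 0.0                            # assumed level-0 adversary: no attack
    history = []
    for k in range(1, levels + 1):
        K = defender_best_response(a, b, d, q, r, gamma, L)
        L = adversary_best_response(a, b, d, q, r, gamma, K)
        history.append((k, K, L))
    return history


if __name__ == "__main__":
    for k, K, L in level_k_gains(6):
        print(f"level {k}: defender gain K = {K:.6f}, adversary gain L = {L:.6f}")
```

With these hypothetical parameters the gains settle after a few levels, which is consistent with the intuition that high levels of reasoning approach equilibrium play, while low levels capture boundedly rational agents. An estimator of the kind the abstract mentions would then compare observed adversary behavior against these level-wise policies; that step is not sketched here.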