{"title":"Optimality equations in undiscounted Markov decision processes","authors":"M. Puterman","doi":"10.1109/CDC.1999.832928","DOIUrl":null,"url":null,"abstract":"We explore properties of the average and bias optimality equations in unichain Markov decision processes. We show that in unichain models, these equations have the same form, so that theory for gain optimality carries over to bias optimality.","PeriodicalId":137513,"journal":{"name":"Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CDC.1999.832928","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Optimality equations in undiscounted Markov decision processes
We explore properties of the average and bias optimality equations in unichain Markov decision processes. We show that in unichain models, these equations have the same form, so that theory for gain optimality carries over to bias optimality.
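As a rough illustration of the average (gain) optimality equation the abstract refers to, the sketch below solves its standard unichain form, h(s) + g = max_a { r(s,a) + sum_j p(j|s,a) h(j) }, by relative value iteration on a two-state toy model. The notation (g for gain, h for relative values), the function names, and all numbers in the model are assumptions chosen for illustration; none of them come from the paper, and relative value iteration is a textbook technique, not necessarily the one the paper uses. The bias optimality equation is not solved here.

```python
def relative_value_iteration(P, r, ref=0, tol=1e-10, max_iter=10_000):
    """Approximate (g, h) solving h(s) + g = max_a { r[s][a] + sum_j P[s][a][j] * h(j) }.

    P[s][a] is the list of transition probabilities from state s under action a,
    r[s][a] is the one-step reward, and h is normalised so that h[ref] == 0.
    Assumes a unichain, aperiodic model (hypothetical helper, not from the paper).
    """
    n = len(P)
    h = [0.0] * n
    g = 0.0
    for _ in range(max_iter):
        # Apply the one-step optimality operator to the current h.
        Th = [
            max(
                r[s][a] + sum(p * h[j] for j, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        g = Th[ref] - h[ref]                # current gain estimate
        new_h = [x - Th[ref] for x in Th]   # renormalise so new_h[ref] == 0
        diff = max(abs(x - y) for x, y in zip(new_h, h))
        h = new_h
        if diff < tol:
            break
    return g, h


def aoe_residual(P, r, g, h):
    """Largest absolute violation of the average optimality equation."""
    worst = 0.0
    for s in range(len(P)):
        best = max(
            r[s][a] + sum(p * h[j] for j, p in enumerate(P[s][a]))
            for a in range(len(P[s]))
        )
        worst = max(worst, abs(best - g - h[s]))
    return worst


# Toy unichain model (illustrative numbers only): state 1 is absorbing.
P = [
    [[0.5, 0.5], [0.0, 1.0]],  # state 0: two actions
    [[0.0, 1.0]],              # state 1: one action
]
r = [
    [5.0, 10.0],
    [-1.0],
]

g, h = relative_value_iteration(P, r, ref=1)
print("gain:", g)
print("relative values:", h)
print("residual:", aoe_residual(P, r, g, h))
```

In this toy model both actions in state 0 lead to the same long-run average reward of -1, because the chain is eventually absorbed in state 1, so the gain alone does not separate them. The relative values do: h(0) - h(1) converges to 12, and only the first action attains the maximum in state 0. This is the kind of distinction that bias optimality is meant to capture.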