A New Sample-Efficient PAC Reinforcement Learning Algorithm
A. Zehfroosh, H. Tanner
2020 28th Mediterranean Conference on Control and Automation (MED), 2020-09-01
DOI: https://doi.org/10.1109/MED48518.2020.9182985
Abstract
This paper introduces a new hybrid probably approximately correct (PAC) reinforcement learning (RL) algorithm for Markov decision processes (MDPs) that retains favorable features of its parent algorithms. The algorithm, DDQ, integrates model-free and model-based learning approaches, preserving some advantages of both. A PAC analysis of the DDQ algorithm is presented, and its sample complexity is explicitly bounded. Numerical results from a small-scale example, motivated by work on human-robot interaction models, corroborate the theoretical predictions on sample complexity.