{"title":"Model-based Reinforcement Learning: A Survey","authors":"Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker","doi":"10.1561/2200000086","DOIUrl":"https://doi.org/10.1561/2200000086","url":null,"abstract":"Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is an important challenge in artificial intelligence. Two key approaches to this problem are reinforcement learning (RL) and planning. This monograph surveys an integration of both fields, better known as model-based reinforcement learning. Model-based RL has two main steps: dynamics model learning and planning-learning integration. In this comprehensive survey of the topic, the authors first cover dynamics model learning, including challenges such as dealing with stochasticity, uncertainty, partial observability, and temporal abstraction. They then present a systematic categorization of planning-learning integration, including aspects such as: where to start planning, what budgets to allocate to planning and real data collection, how to plan, and how to integrate planning in the learning and acting loop. In conclusion the authors discuss implicit model-based RL as an end-to-end alternative for model learning and planning, and cover the potential benefits of model-based RL. Along the way, the authors draw connections to several related RL fields, including hierarchical RL and transfer learning. This monograph contains a broad conceptual overview of the combination of planning and learning for Markov Decision Process optimization. It provides a clear and complete introduction to the topic for students and researchers alike.","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135799442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic Learning","authors":"T. Jo","doi":"10.1007/978-3-030-65900-4_6","DOIUrl":"https://doi.org/10.1007/978-3-030-65900-4_6","url":null,"abstract":"","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"4 1","pages":""},"PeriodicalIF":32.8,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75335216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Encoding","authors":"T. Jo","doi":"10.1007/978-3-030-65900-4_3","DOIUrl":"https://doi.org/10.1007/978-3-030-65900-4_3","url":null,"abstract":"","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"1 1","pages":""},"PeriodicalIF":32.8,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88609756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Numerical Vectors","authors":"T. Jo","doi":"10.1007/978-3-030-65900-4_2","DOIUrl":"https://doi.org/10.1007/978-3-030-65900-4_2","url":null,"abstract":"","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"108 1","pages":""},"PeriodicalIF":32.8,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86992067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced Clustering","authors":"T. Jo","doi":"10.1007/978-3-030-65900-4_12","DOIUrl":"https://doi.org/10.1007/978-3-030-65900-4_12","url":null,"abstract":"","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"57 1","pages":""},"PeriodicalIF":32.8,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82258178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal Learning","authors":"T. Jo","doi":"10.1007/978-3-030-65900-4_15","DOIUrl":"https://doi.org/10.1007/978-3-030-65900-4_15","url":null,"abstract":"","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"45 1","pages":""},"PeriodicalIF":32.8,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85906006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-supervised Learning","authors":"T. Jo","doi":"10.1007/978-3-030-65900-4_14","DOIUrl":"https://doi.org/10.1007/978-3-030-65900-4_14","url":null,"abstract":"","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"9 1","pages":""},"PeriodicalIF":32.8,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84571505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reinforcement Learning","authors":"T. Jo","doi":"10.1007/978-3-030-65900-4_16","DOIUrl":"https://doi.org/10.1007/978-3-030-65900-4_16","url":null,"abstract":"","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"364 1","pages":""},"PeriodicalIF":32.8,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80304699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Good Is Your Scientific Data Generative Model?","authors":"Yuxin Yang, Ben Gremillion, Xitong Zhang, Youzuo Lin, B. Wohlberg, Qiang Guan","doi":"10.1109/MLHPCAI4S51975.2020.00018","DOIUrl":"https://doi.org/10.1109/MLHPCAI4S51975.2020.00018","url":null,"abstract":"Nowadays, leveraging data augmentation methods on helping resolving scientific problems becomes prevailing. And many scientific problems benefit from data augmentation methods build with deep generative models. Yet due to the complexity of the scientific data, commonly used evaluation methods of generative models appear not so suitable for generated scientific data. In this paper, we explore how do we effectively evaluate data augmentation methods for scientific data generative models? To answer this question, we use one example of real world scientific problem to show how we evaluate the quality of the generated data from two domain specific deep generative models. We observe that most existing state-of-art evaluation metrics are incompetent. They either show completely contradicting results or provide inaccurate insight from real data.","PeriodicalId":47667,"journal":{"name":"Foundations and Trends in Machine Learning","volume":"56 1","pages":"96-102"},"PeriodicalIF":32.8,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75362366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}