{"title":"这个问题","authors":"H. Mounce","doi":"10.4324/9781315203225-3","DOIUrl":null,"url":null,"abstract":"Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity — a non-stationary world might render past learning useless — but also because continual tracking in a temporally coherent world can result in better performance than a fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. More specifically, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations – data representations explicitly optimized for online learning – and argue that existing representation learning methods do not learn such representations. I investigate if neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but learning these representations online using gradient-based methods is challenging. More specifically, long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking for slowly and continually improving representations online. The primary idea behind LwB is that while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify if an update is useful online.","PeriodicalId":389393,"journal":{"name":"Tolstoy on Aesthetics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Problem\",\"authors\":\"H. Mounce\",\"doi\":\"10.4324/9781315203225-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity — a non-stationary world might render past learning useless — but also because continual tracking in a temporally coherent world can result in better performance than a fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. More specifically, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations – data representations explicitly optimized for online learning – and argue that existing representation learning methods do not learn such representations. I investigate if neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but learning these representations online using gradient-based methods is challenging. More specifically, long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking for slowly and continually improving representations online. 
The primary idea behind LwB is that while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify if an update is useful online.\",\"PeriodicalId\":389393,\"journal\":{\"name\":\"Tolstoy on Aesthetics\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Tolstoy on Aesthetics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4324/9781315203225-3\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Tolstoy on Aesthetics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4324/9781315203225-3","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity (a non-stationary world can render past learning useless) but also because continual tracking in a temporally coherent world can yield better performance than any fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. In particular, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations (data representations explicitly optimized for online learning) and argue that existing representation learning methods do not learn such representations. I then investigate whether neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but that learning these representations online with gradient-based methods is challenging: long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking (LwB) for slowly and continually improving representations online. The primary idea behind LwB is that, while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify online whether an update is useful.
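To make the tracking claim concrete, here is a minimal sketch (not from the thesis; all names, dimensions, and constants are illustrative) in which an online SGD learner follows a slowly drifting linear target and is compared against a solution frozen after a warm-up period. Under drift, the tracker's running error should stay well below the frozen solution's:

```python
import numpy as np

rng = np.random.default_rng(0)
d, steps, warmup, lr = 5, 2000, 200, 0.05

w_true = rng.normal(size=d)   # weights of the drifting target function
w_track = np.zeros(d)         # online learner, updated at every step
w_fixed = None                # "fixed solution", frozen after warm-up

fixed_err, track_err = [], []
for t in range(steps):
    w_true += 0.01 * rng.normal(size=d)      # non-stationary world: target drifts
    x = rng.normal(size=d)
    y = w_true @ x
    err = w_track @ x - y
    w_track -= lr * err * x                  # continual tracking via online SGD
    if t == warmup:
        w_fixed = w_track.copy()             # freeze a snapshot: the fixed solution
    if w_fixed is not None:
        fixed_err.append((w_fixed @ x - y) ** 2)
        track_err.append((w_track @ x - y) ** 2)

print("frozen-solution MSE:", np.mean(fixed_err))
print("online-tracker  MSE:", np.mean(track_err))
```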
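The verify-rather-than-estimate idea behind LwB can also be sketched in code. This is a hypothetical reading, not the thesis's actual algorithm: the proposal mechanism here is a plain random perturbation, and `lwb_step`, `rep_net`, and `head` are made-up names; only the accept-if-verified-else-backtrack structure comes from the abstract.

```python
import copy
import torch

def lwb_step(rep_net, head, x_probe, y_probe, loss_fn, noise_scale=1e-2):
    """Hypothetical verify-then-accept step: propose a cheap candidate
    update to the representation network, keep it only if it lowers the
    loss on freshly observed online data, and backtrack otherwise."""
    backup = copy.deepcopy(rep_net.state_dict())     # checkpoint for backtracking

    with torch.no_grad():
        before = loss_fn(head(rep_net(x_probe)), y_probe).item()

        # Candidate update: a random perturbation stands in for whatever
        # proposal mechanism LwB actually uses (an assumption of this sketch).
        for p in rep_net.parameters():
            p.add_(noise_scale * torch.randn_like(p))

        after = loss_fn(head(rep_net(x_probe)), y_probe).item()

    if after >= before:                              # not verified useful: backtrack
        rep_net.load_state_dict(backup)
        return False
    return True
```

Repeating such steps commits only verified improvements, which matches the abstract's framing of slowly and continually improving representations online.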