{"title":"这个问题","authors":"H. Mounce","doi":"10.4324/9781315203225-3","DOIUrl":null,"url":null,"abstract":"Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity — a non-stationary world might render past learning useless — but also because continual tracking in a temporally coherent world can result in better performance than a fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. More specifically, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations – data representations explicitly optimized for online learning – and argue that existing representation learning methods do not learn such representations. I investigate if neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but learning these representations online using gradient-based methods is challenging. More specifically, long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking for slowly and continually improving representations online. The primary idea behind LwB is that while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify if an update is useful online.","PeriodicalId":389393,"journal":{"name":"Tolstoy on Aesthetics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Problem\",\"authors\":\"H. Mounce\",\"doi\":\"10.4324/9781315203225-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity — a non-stationary world might render past learning useless — but also because continual tracking in a temporally coherent world can result in better performance than a fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. More specifically, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations – data representations explicitly optimized for online learning – and argue that existing representation learning methods do not learn such representations. I investigate if neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but learning these representations online using gradient-based methods is challenging. More specifically, long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking for slowly and continually improving representations online. 
The primary idea behind LwB is that while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify if an update is useful online.\",\"PeriodicalId\":389393,\"journal\":{\"name\":\"Tolstoy on Aesthetics\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Tolstoy on Aesthetics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4324/9781315203225-3\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Tolstoy on Aesthetics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4324/9781315203225-3","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity (a non-stationary world can render past learning useless) but also because continual tracking in a temporally coherent world can yield better performance than any fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. In particular, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations (data representations explicitly optimized for online learning) and argue that existing representation learning methods do not learn such representations. I then investigate whether neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but that learning these representations online with gradient-based methods is challenging: long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking (LwB) for slowly and continually improving representations online. The primary idea behind LwB is that, while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify online whether an update is useful.
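To make the tracking claim concrete, here is a minimal sketch (not from the thesis; all names, dimensions, and constants are illustrative) in which an online SGD learner follows a slowly drifting linear target and is compared against a solution frozen after a warm-up period. Under drift, the tracker's running error should stay well below the frozen solution's:

```python
import numpy as np

rng = np.random.default_rng(0)
d, steps, warmup, lr = 5, 2000, 200, 0.05

w_true = rng.normal(size=d)   # weights of the drifting target function
w_track = np.zeros(d)         # online learner, updated at every step
w_fixed = None                # "fixed solution", frozen after warm-up

fixed_err, track_err = [], []
for t in range(steps):
    w_true += 0.01 * rng.normal(size=d)      # non-stationary world: target drifts
    x = rng.normal(size=d)
    y = w_true @ x
    err = w_track @ x - y
    w_track -= lr * err * x                  # continual tracking via online SGD
    if t == warmup:
        w_fixed = w_track.copy()             # freeze a snapshot: the fixed solution
    if w_fixed is not None:
        fixed_err.append((w_fixed @ x - y) ** 2)
        track_err.append((w_track @ x - y) ** 2)

print("frozen-solution MSE:", np.mean(fixed_err))
print("online-tracker  MSE:", np.mean(track_err))
```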
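The verify-rather-than-estimate idea behind LwB can also be sketched in code. This is a hypothetical reading, not the thesis's actual algorithm: the proposal mechanism here is a plain random perturbation, and `lwb_step`, `rep_net`, and `head` are made-up names; only the accept-if-verified-else-backtrack structure comes from the abstract.

```python
import copy
import torch

def lwb_step(rep_net, head, x_probe, y_probe, loss_fn, noise_scale=1e-2):
    """Hypothetical verify-then-accept step: propose a cheap candidate
    update to the representation network, keep it only if it lowers the
    loss on freshly observed online data, and backtrack otherwise."""
    backup = copy.deepcopy(rep_net.state_dict())     # checkpoint for backtracking

    with torch.no_grad():
        before = loss_fn(head(rep_net(x_probe)), y_probe).item()

        # Candidate update: a random perturbation stands in for whatever
        # proposal mechanism LwB actually uses (an assumption of this sketch).
        for p in rep_net.parameters():
            p.add_(noise_scale * torch.randn_like(p))

        after = loss_fn(head(rep_net(x_probe)), y_probe).item()

    if after >= before:                              # not verified useful: backtrack
        rep_net.load_state_dict(backup)
        return False
    return True
```

Repeating such steps commits only verified improvements, which matches the abstract's framing of slowly and continually improving representations online.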