The Problem

H. Mounce
{"title":"这个问题","authors":"H. Mounce","doi":"10.4324/9781315203225-3","DOIUrl":null,"url":null,"abstract":"Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity — a non-stationary world might render past learning useless — but also because continual tracking in a temporally coherent world can result in better performance than a fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. More specifically, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations – data representations explicitly optimized for online learning – and argue that existing representation learning methods do not learn such representations. I investigate if neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but learning these representations online using gradient-based methods is challenging. More specifically, long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking for slowly and continually improving representations online. The primary idea behind LwB is that while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify if an update is useful online.","PeriodicalId":389393,"journal":{"name":"Tolstoy on Aesthetics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Problem\",\"authors\":\"H. Mounce\",\"doi\":\"10.4324/9781315203225-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity — a non-stationary world might render past learning useless — but also because continual tracking in a temporally coherent world can result in better performance than a fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. More specifically, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations – data representations explicitly optimized for online learning – and argue that existing representation learning methods do not learn such representations. I investigate if neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but learning these representations online using gradient-based methods is challenging. More specifically, long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking for slowly and continually improving representations online. 
The primary idea behind LwB is that while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify if an update is useful online.\",\"PeriodicalId\":389393,\"journal\":{\"name\":\"Tolstoy on Aesthetics\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Tolstoy on Aesthetics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4324/9781315203225-3\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Tolstoy on Aesthetics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4324/9781315203225-3","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Learning online is essential for an agent to perform well in an ever-changing world. An agent has to learn online not only out of necessity, since a non-stationary world might render past learning useless, but also because continual tracking in a temporally coherent world can result in better performance than a fixed solution. Despite the necessity of online learning, we have made little progress towards building robust online learning methods. More specifically, a scalable online representation learning method for neural network function approximators has remained elusive. In this thesis, I investigate the reasons behind this lack of progress. I propose the idea of online-aware representations, data representations explicitly optimized for online learning, and argue that existing representation learning methods do not learn such representations. I then investigate whether neural networks are capable of learning these representations. My results suggest that neural networks can indeed learn representations that are highly effective for online learning, but that learning these representations online using gradient-based methods is challenging. In particular, long-term credit assignment using back-propagation through time (BPTT) does not scale with the size of the problem. To address this, I propose Learning with Backtracking (LwB), a method for slowly and continually improving representations online. The primary idea behind LwB is that while it is not possible to compute an accurate estimate of the representation update online, it is possible to verify online whether an update is useful.
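The abstract does not specify how LwB proposes or verifies updates, but the verify-rather-than-estimate principle can be illustrated with a minimal sketch. Everything below is a hypothetical illustration of the stated idea, not the author's actual algorithm: `rep_net`, `predictor`, the random perturbation used as a candidate update, and the single-batch verification test are all assumptions made for the example.

```python
import copy
import torch
import torch.nn.functional as F

def lwb_step(rep_net, predictor, x, y, noise_scale=0.01):
    """One hypothetical backtracking step: propose a candidate change to
    the representation network, keep it only if it verifiably lowers the
    loss on the current online batch, and roll it back otherwise."""
    # Loss under the current representation.
    with torch.no_grad():
        loss_before = F.mse_loss(predictor(rep_net(x)), y)

    # Propose a candidate update. A random weight perturbation stands in
    # here for whatever proposal mechanism the thesis actually uses
    # (the abstract does not say), since any cheap generator will do
    # when the filter is verification.
    backup = copy.deepcopy(rep_net.state_dict())
    with torch.no_grad():
        for p in rep_net.parameters():
            p.add_(noise_scale * torch.randn_like(p))
        loss_after = F.mse_loss(predictor(rep_net(x)), y)

    # Verification is cheap online even when accurate estimation is not:
    # compare the two losses and backtrack if the candidate did not help.
    if loss_after >= loss_before:
        rep_net.load_state_dict(backup)
        return False  # candidate rejected
    return True       # candidate verified and kept
```

A practical version would presumably verify over a window of online data rather than a single batch, since single-batch comparisons are noisy; the point of the sketch is only that accepting or rejecting a concrete candidate is feasible online, whereas computing the right update directly (e.g., via long-horizon BPTT) is not.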