Long Short-Term Memory with Slower Information Decay
H. Chien, Javier Turek, Nicole M. Beckage, Vy A. Vo, C. Honey, Ted L. Willke
LatinX in AI at International Conference on Machine Learning 2021. DOI: 10.52591/2021072418
Abstract
Learning to process long-range dependencies has been a challenge for recurrent neural networks. Despite the improvements achieved by long short-term memory (LSTM) networks, their gating mechanism results in exponential decay of information, limiting their capacity to capture long-range dependencies. In this work, we present a power law forget gate, which instead has a slower rate of information decay. We propose a power law-based LSTM (pLSTM), built on the LSTM but with a power law forget gate. We empirically test the pLSTM on the copy task, sentiment classification, and sequential MNIST, all tasks with long-range dependencies. The pLSTM solves these tasks, outperforming the LSTM, especially for long-range dependencies. Further, the pLSTM learns sparser and more robust representations.
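The abstract contrasts the exponential information decay induced by a standard sigmoid forget gate with the slower, power-law decay the pLSTM is designed to achieve. The following minimal sketch (not the authors' formulation; the forget value f and exponent p are illustrative assumptions) shows why this matters: repeated multiplication by a constant forget gate shrinks stored information as f^t, while a power-law schedule t^(-p) retains far more signal at long lags.

```
# Minimal sketch (assumptions, not the paper's pLSTM equations):
# compare exponential decay from a constant LSTM forget gate with a
# power-law decay schedule of the kind the abstract describes.
import numpy as np

T = 100                       # number of time steps
f = 0.9                       # constant sigmoid forget-gate value (assumption)
p = 0.5                       # power-law decay exponent (assumption)

steps = np.arange(1, T + 1)

# Standard LSTM: repeatedly multiplying the cell state by f gives
# c_t = f**t * c_0, i.e. exponential decay of the stored information.
exp_decay = f ** steps

# Power-law decay: information shrinks as t**(-p), which decreases
# much more slowly for large t, preserving long-range dependencies.
power_decay = steps.astype(float) ** (-p)

for t in (1, 10, 100):
    print(f"t={t:3d}  exponential={f**t:.2e}  power-law={t**(-p):.2e}")
```

At t = 100, the exponential term is on the order of 1e-5 while the power-law term is still 0.1, illustrating the slower information decay the pLSTM's forget gate is intended to provide.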