Nonlinear Regression With Hierarchical Recurrent Neural Networks Under Missing Data

S. Onur Sahin;Suleyman S. Kozat
{"title":"Nonlinear Regression With Hierarchical Recurrent Neural Networks Under Missing Data","authors":"S. Onur Sahin;Suleyman S. Kozat","doi":"10.1109/TAI.2024.3404414","DOIUrl":null,"url":null,"abstract":"We study regression (or prediction) of sequential data, which may have missing entries and/or different lengths. This problem is heavily investigated in the machine learning literature since such missingness is a common occurrence in most real-life applications due to data corruption, measurement errors, and similar. To this end, we introduce a novel hierarchical architecture involving a set of long short-term memory (LSTM) networks, which use only the existing inputs in the sequence without any imputations or statistical assumptions on the missing data. To incorporate the missingness information, we partition the input space into different regions in a hierarchical manner based on the “presence-pattern” of the previous inputs and then assign different LSTM networks to these regions. In this sense, we use the LSTM networks as our experts for these regions and adaptively combine their outputs to generate our final output. Our method is generic so that the set of partitioned regions (presence-patterns) that are modeled by the LSTM networks can be customized, and one can readily use other sequential architectures such as gated recurrent unit (GRU) networks and recurrent neural networks (RNNs) as shown in the article. We also provide the computational complexity analysis of the proposed architecture, which is in the same order as a conventional LSTM architecture. In our experiments, our algorithm achieves significant performance improvements on the well-known financial and real-life datasets with respect to the state-of-the-art methods. We also share the source code of our algorithm to facilitate other research and the replicability of our results.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10536892/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We study regression (or prediction) of sequential data, which may have missing entries and/or different lengths. This problem is heavily investigated in the machine learning literature, since such missingness is common in most real-life applications due to data corruption, measurement errors, and similar causes. To this end, we introduce a novel hierarchical architecture involving a set of long short-term memory (LSTM) networks, which use only the existing inputs in the sequence, without any imputation or statistical assumptions on the missing data. To incorporate the missingness information, we partition the input space into different regions in a hierarchical manner based on the "presence-pattern" of the previous inputs and then assign different LSTM networks to these regions. In this sense, we use the LSTM networks as our experts for these regions and adaptively combine their outputs to generate our final output. Our method is generic in that the set of partitioned regions (presence-patterns) modeled by the LSTM networks can be customized, and one can readily substitute other sequential architectures such as gated recurrent unit (GRU) networks and recurrent neural networks (RNNs), as shown in the article. We also provide a computational complexity analysis of the proposed architecture, which is of the same order as that of a conventional LSTM architecture. In our experiments, our algorithm achieves significant performance improvements on well-known financial and real-life datasets with respect to state-of-the-art methods. We also share the source code of our algorithm to facilitate further research and the replicability of our results.