An end-to-end optimized feature specific data imputation for recurrent neural networks under missing data

Impact factor 2.9 · Zone 3 (Engineering and Technology) · JCR Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Safa Onur Sahin, Suleyman Serdar Kozat
DOI: 10.1016/j.dsp.2025.105349
Journal: Digital Signal Processing, volume 166, Article 105349
Publication date: 2025-05-27
URL: https://www.sciencedirect.com/science/article/pii/S1051200425003719
Citations: 0

Abstract

We investigate regression and classification of time series under missing data, which occurs in most real-life applications and severely degrades the performance of most, if not all, machine learning algorithms. We introduce a novel algorithm for processing time series with missing values that uses a set of different imputation models to fill in these values. We formulate imputation selection in a multi-armed bandit framework, where imputation functions are selected specifically for each feature. In particular, we simultaneously select an imputation model for each feature/component of the input vector from a set of imputation algorithms and train these imputation models along with the network for the target task in an end-to-end manner. Since individual features may have widely distinct characteristics and temporal behaviors, a single imputation algorithm may perform inadequately when applied to all of the features. Our method is generic: the set of imputation models can be extended straightforwardly with additional imputation methods, and the method is equally applicable to recurrent neural network architectures, even when the arrival times of the feature vectors are non-uniform. In our experiments, we achieve significant performance improvements over state-of-the-art methods on well-known real-life datasets under different missing data regimes.
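
The per-feature selection idea in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify the bandit policy, the reward, or the imputation models, so everything here is an assumption made for illustration. The sketch uses an epsilon-greedy policy, three fixed (non-trainable) imputers named `forward_fill`, `running_mean`, and `zero_fill`, and a reward equal to the negative squared imputation error against a held-out true value. In the paper, the imputers are trained jointly with the recurrent network for the target task, which this sketch does not attempt.

```python
import random

# Hypothetical, non-trainable imputers: each maps a feature's history to a guess.
def forward_fill(history):
    return history[-1]

def running_mean(history):
    return sum(history) / len(history)

def zero_fill(history):
    return 0.0

IMPUTERS = [forward_fill, running_mean, zero_fill]


class FeatureBandit:
    """Epsilon-greedy bandit over imputers for ONE feature (assumed policy)."""

    def __init__(self, n_arms, epsilon, rng):
        self.epsilon = epsilon
        self.rng = rng
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))  # explore
        # Exploit. Rewards are <= 0, so untried arms (value 0.0) are tried first.
        return self.best_arm()

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

    def best_arm(self):
        return max(range(len(self.values)), key=lambda a: self.values[a])


def select_imputers(signals, steps=200, epsilon=0.1, seed=0):
    """Return the name of the preferred imputer for each feature signal."""
    rng = random.Random(seed)
    chosen = []
    for signal in signals:
        bandit = FeatureBandit(len(IMPUTERS), epsilon, rng)
        history = [signal(0), signal(1)]  # warm-up observations
        for t in range(2, steps):
            truth = signal(t)  # pretend x_t is missing, then score the guess
            arm = bandit.select()
            guess = IMPUTERS[arm](history)
            # Assumed reward: negative squared imputation error.
            bandit.update(arm, -(truth - guess) ** 2)
            history.append(truth)
        chosen.append(IMPUTERS[bandit.best_arm()].__name__)
    return chosen


def ramp(t):  # slowly drifting feature
    return 10.0 + 0.1 * t

def alternating(t):  # feature oscillating around 5
    return 6.0 if t % 2 == 0 else 4.0

if __name__ == "__main__":
    print(select_imputers([ramp, alternating]))
```

Each feature gets its own bandit, so a drifting feature and an oscillating feature can settle on different imputers, which mirrors the abstract's argument that a single imputation algorithm may not suit all features. With `epsilon=0.0` the run is deterministic and `select_imputers([ramp, alternating], epsilon=0.0)` prefers `forward_fill` for the ramp and `running_mean` for the alternating signal.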
Source journal
Digital Signal Processing (Engineering and Technology: Engineering, Electrical & Electronic)
CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days
About the journal: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy