Video-Tensor Completion using a Deep Learning approach

2020 IEEE Colombian Conference on Applications of Computational Intelligence (IEEE ColCACI 2020) | Pub Date: 2020-08-07 | DOI: 10.1109/ColCACI50549.2020.9247929

Paula Arguello, David Morales, Y. Fonseca, H. Arguello


Citations: 0

Abstract

The tensor completion problem concerns the recovery of corrupted data in a multidimensional array, known as a tensor. Traditional approaches to tensor completion are based on the transform tensor singular value decomposition (tt-SVD). These approaches minimize the tensor nuclear norm in the domain of an orthogonal transformation to induce a low-tensor-rank representation. Hence, they require prior knowledge of the data to ensure a low-tensor-rank representation and, in turn, a good-quality reconstruction. In contrast, motivated by the broad progress of deep learning in diverse contexts, this paper presents a 3DU-Net architecture for tensor data recovery in grayscale videos. The proposed method consists of convolutional layers with 3D filters that exploit information along the spatio-temporal dimensions. The experimental results show that the proposed method achieves better relative error (RE) and peak signal-to-noise ratio (PSNR), with less runtime, than state-of-the-art solutions. In particular, in the presence of noise, the proposed approach improves the recovery by up to 5.99 dB in PSNR and 0.09 in RE with 85% of the pixels corrupted. In the noiseless case, the proposed architecture improves by 4.39 dB and 0.07 in RE when 85% of the data is lost. Furthermore, the proposed method is at least 2.5 times faster than the state of the art in reconstruction time.
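To make the evaluation setup in the abstract concrete, here is a minimal sketch of the two reported metrics and of a random corruption mask. It is not the authors' code. The abstract does not give the exact formulas, so the sketch assumes the common definitions RE = ||X_hat - X||_F / ||X||_F and PSNR = 10 log10(peak^2 / MSE), with pixel values scaled to [0, 1]. The function names (`relative_error`, `psnr`, `corrupt`) and the flattened-list representation of the video tensor are hypothetical choices made for illustration.

```python
import math
import random


def relative_error(reference, estimate):
    """Assumed definition: Frobenius norm of the error over the norm of the reference."""
    num = math.sqrt(sum((e - r) ** 2 for r, e in zip(reference, estimate)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    return num / den


def psnr(reference, estimate, peak=1.0):
    """Assumed definition: 10 * log10(peak^2 / MSE), in dB; infinite for a perfect match."""
    mse = sum((e - r) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def corrupt(video, missing_fraction=0.85, seed=0):
    """Zero out a random subset of entries of a flattened video tensor.

    Returns the observed tensor and a binary mask (1 = observed, 0 = missing).
    """
    rng = random.Random(seed)
    n = len(video)
    missing = set(rng.sample(range(n), round(missing_fraction * n)))
    mask = [0 if i in missing else 1 for i in range(n)]
    observed = [v * m for v, m in zip(video, mask)]
    return observed, mask


if __name__ == "__main__":
    # Hypothetical toy "video": 5 frames of 4 x 4 grayscale pixels, flattened.
    frames, height, width = 5, 4, 4
    rng = random.Random(1)
    video = [rng.random() for _ in range(frames * height * width)]

    observed, mask = corrupt(video, missing_fraction=0.85)
    print("observed fraction:", sum(mask) / len(mask))
    # Scores of the corrupted input itself, i.e. the baseline a completion method must beat.
    print("RE of corrupted input:", relative_error(video, observed))
    print("PSNR of corrupted input (dB):", psnr(video, observed))
```

A completion method would take `observed` and `mask` as input and return an estimate of `video`; the improvements quoted in the abstract are differences in these two scores between the proposed network and the competing methods.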

