Shifted Spatial-Spectral Convolution for Deep Neural Networks

Yuhao Xu, Hideki Nakayama
{"title":"深度神经网络的移位空间-频谱卷积","authors":"Yuhao Xu, Hideki Nakayama","doi":"10.1145/3338533.3366575","DOIUrl":null,"url":null,"abstract":"Deep convolutional neural networks (CNNs) extract local features and learn spatial representations via convolutions in the spatial domain. Beyond the spatial information, some works also manage to capture the spectral information in the frequency domain by domain switching methods like discrete Fourier transform (DFT) and discrete cosine transform (DCT). However, most works only pay attention to a single domain, which is prone to ignoring other important features. In this work, we propose a novel network structure to combine spatial and spectral convolutions, and extract features in both spatial and frequency domains. The input channels are divided into two groups for spatial and spectral representations respectively, and then integrated for feature fusion. Meanwhile, we design a channel-shifting mechanism to ensure both spatial and spectral information of every channel are equally and adequately obtained throughout the deep networks. Experimental results demonstrate that compared with state-of-the-art CNN models in a single domain, our shifted spatial-spectral convolution based networks achieve better performance on image classification datasets including CIFAR10, CIFAR100 and SVHN, with considerably fewer parameters.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Shifted Spatial-Spectral Convolution for Deep Neural Networks\",\"authors\":\"Yuhao Xu, Hideki Nakayama\",\"doi\":\"10.1145/3338533.3366575\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep convolutional neural networks (CNNs) extract local features and learn spatial representations via convolutions in the spatial domain. Beyond the spatial information, some works also manage to capture the spectral information in the frequency domain by domain switching methods like discrete Fourier transform (DFT) and discrete cosine transform (DCT). However, most works only pay attention to a single domain, which is prone to ignoring other important features. In this work, we propose a novel network structure to combine spatial and spectral convolutions, and extract features in both spatial and frequency domains. The input channels are divided into two groups for spatial and spectral representations respectively, and then integrated for feature fusion. Meanwhile, we design a channel-shifting mechanism to ensure both spatial and spectral information of every channel are equally and adequately obtained throughout the deep networks. 
Experimental results demonstrate that compared with state-of-the-art CNN models in a single domain, our shifted spatial-spectral convolution based networks achieve better performance on image classification datasets including CIFAR10, CIFAR100 and SVHN, with considerably fewer parameters.\",\"PeriodicalId\":273086,\"journal\":{\"name\":\"Proceedings of the ACM Multimedia Asia\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ACM Multimedia Asia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3338533.3366575\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Multimedia Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3338533.3366575","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

Deep convolutional neural networks (CNNs) extract local features and learn spatial representations via convolutions in the spatial domain. Beyond the spatial information, some works also manage to capture the spectral information in the frequency domain by domain switching methods like discrete Fourier transform (DFT) and discrete cosine transform (DCT). However, most works only pay attention to a single domain, which is prone to ignoring other important features. In this work, we propose a novel network structure to combine spatial and spectral convolutions, and extract features in both spatial and frequency domains. The input channels are divided into two groups for spatial and spectral representations respectively, and then integrated for feature fusion. Meanwhile, we design a channel-shifting mechanism to ensure both spatial and spectral information of every channel are equally and adequately obtained throughout the deep networks. Experimental results demonstrate that compared with state-of-the-art CNN models in a single domain, our shifted spatial-spectral convolution based networks achieve better performance on image classification datasets including CIFAR10, CIFAR100 and SVHN, with considerably fewer parameters.
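The abstract only describes the architecture at a high level: channels are split into a spatial group and a spectral group, each group is convolved in its own domain, the results are fused, and a channel-shifting mechanism routes every channel through both branches over depth. The sketch below is one possible reading of that idea in PyTorch, not the authors' implementation. The class and parameter names (`SpatialSpectralBlock`, `shift`), the use of a DFT (`torch.fft`) rather than a DCT, the 3x3/1x1 kernel sizes, and the concatenate-then-roll fusion are all assumptions made for illustration.

```python
# A minimal sketch of a shifted spatial-spectral block, assuming PyTorch.
# Exact layer sizes, the DFT-vs-DCT choice, the fusion operator, and the
# shift amount are NOT specified by the abstract; everything here is a
# hypothetical reconstruction for illustration only.
import torch
import torch.nn as nn


class SpatialSpectralBlock(nn.Module):
    """Split channels into a spatial group and a spectral group, convolve
    each in its own domain, then fuse the groups and shift channels."""

    def __init__(self, channels: int, shift: int = 1):
        super().__init__()
        assert channels % 2 == 0, "channels are split into two equal groups"
        half = channels // 2
        self.shift = shift
        # Spatial branch: ordinary 3x3 convolution in the spatial domain.
        self.spatial_conv = nn.Conv2d(half, half, kernel_size=3, padding=1)
        # Spectral branch: 1x1 convolution applied to the real and imaginary
        # parts of the 2-D DFT (an assumption; the paper may use a DCT).
        self.spectral_conv = nn.Conv2d(2 * half, 2 * half, kernel_size=1)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        half = x.shape[1] // 2
        x_spa, x_spe = x[:, :half], x[:, half:]

        # Spatial-domain convolution on the first channel group.
        y_spa = self.spatial_conv(x_spa)

        # Frequency-domain convolution on the second group:
        # DFT -> pointwise convolution -> inverse DFT.
        h, w = x_spe.shape[-2:]
        freq = torch.fft.rfft2(x_spe, norm="ortho")
        freq = torch.cat([freq.real, freq.imag], dim=1)
        freq = self.spectral_conv(freq)
        real, imag = freq.chunk(2, dim=1)
        y_spe = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")

        # Fuse the two groups and roll the channel axis so that, across
        # consecutive blocks, every channel visits both branches.
        y = torch.cat([y_spa, y_spe], dim=1)
        y = torch.roll(y, shifts=self.shift, dims=1)
        return self.act(self.bn(y))


# Hypothetical usage: stacking such blocks lets the channel roll expose
# each channel to both spatial and spectral convolutions over depth.
block = SpatialSpectralBlock(channels=64)
out = block(torch.randn(8, 64, 32, 32))  # -> shape (8, 64, 32, 32)
```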