Title: An interpretable convolutional neural network via generalized time–frequency scattering
Authors: Xiaoping Liu, Gong Chen, Jun Shi, Ran Tao
Journal: Signal Processing, vol. 237, Article 110043 (published 2025-04-28)
DOI: 10.1016/j.sigpro.2025.110043
URL: https://www.sciencedirect.com/science/article/pii/S0165168425001574
Impact factor: 3.4; JCR: Q2 (Engineering, Electrical & Electronic)
Citations: 0
Abstract
Convolutional neural networks (CNNs) have recently demonstrated impressive performance in complex machine learning tasks. However, a CNN requires a large quantity of annotated data to converge to a good solution, and the theoretical understanding of such networks is still in its infancy. To address these issues, a variant of the CNN, dubbed the deep scattering network (DSN), has been proposed based on linear time–frequency transforms. The DSN inherits the hierarchical structure of the CNN but uses predefined wavelet/Gabor filters as its convolutional kernels instead of data-driven linear filters. Unfortunately, the DSN has a major drawback: it is suitable for stationary image textures but not for non-stationary ones, since wavelet/Gabor filters are intrinsically linear translation-invariant filters. The aim of this paper is to overcome this deficiency using a generalized linear time–frequency transform, the short-time fractional Fourier transform (STFRFT), which can be interpreted as a bank of linear translation-variant filters and may therefore be well suited to non-stationary texture analysis. We first introduce a generalized time–frequency scattering transform using the STFRFT. Building on this transform, we propose an interpretable CNN that cascades STFRFTs and modulus operators. Moreover, several basic properties of the proposed interpretable CNN are derived, and an efficient implementation of the network is presented. Finally, applications of the derived results are discussed.
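The cascade of STFRFTs and modulus operators described in the abstract can be illustrated with a small 1-D toy. This is only a sketch under stated assumptions, not the authors' implementation: it uses the commonly cited continuous FRFT kernel sampled directly on a grid, a Gaussian window, and plain time averaging as the final low-pass step. The paper targets image textures and presents its own efficient implementation, neither of which is reproduced here. All function names and parameters (`frft_kernel`, `stfrft_modulus`, `scattering_features`, `sigma`, `hop`) are hypothetical.

```python
import numpy as np

def frft_kernel(tau, u, alpha):
    """Sampled FRFT kernel K_alpha(tau, u); assumes alpha is not a multiple of pi."""
    cot = 1.0 / np.tan(alpha)
    csc = 1.0 / np.sin(alpha)
    amp = np.sqrt((1 - 1j * cot) / (2 * np.pi))
    T, U = np.meshgrid(tau, u, indexing="ij")
    return amp * np.exp(1j * 0.5 * cot * (T**2 + U**2) - 1j * csc * T * U)

def stfrft_modulus(x, t_axis, u_axis, alpha, sigma, hop):
    """Modulus of a brute-force STFRFT.

    Rows index window centres (every `hop`-th sample of t_axis),
    columns index fractional-frequency bins in u_axis.
    """
    dt = t_axis[1] - t_axis[0]
    K = frft_kernel(t_axis, u_axis, alpha)           # shape (len(t), len(u))
    centres = t_axis[::hop]
    out = np.empty((len(centres), len(u_axis)))
    for i, c in enumerate(centres):
        g = np.exp(-0.5 * ((t_axis - c) / sigma) ** 2)  # assumed Gaussian window
        out[i] = np.abs((x * g) @ K) * dt               # Riemann-sum approximation
    return out

def scattering_features(x, t_axis, u_axis, alpha, sigma, hop=4):
    """Orders 0, 1 and 2 of a toy scattering cascade, averaged over time."""
    s1 = stfrft_modulus(x, t_axis, u_axis, alpha, sigma, hop)   # layer 1
    t1 = t_axis[::hop]
    # Layer 2: apply the same transform + modulus to each layer-1 channel.
    s2 = [stfrft_modulus(s1[:, k], t1, u_axis, alpha, sigma, 1).mean(axis=0)
          for k in range(len(u_axis))]
    return np.concatenate([[x.mean()], s1.mean(axis=0), np.ravel(s2)])

# Usage: a linear chirp, i.e. a simple non-stationary signal.
t = np.linspace(-4.0, 4.0, 128)
u = np.linspace(-3.0, 3.0, 5)
signal = np.cos(2.0 * t**2)
features = scattering_features(signal, t, u, alpha=np.pi / 3, sigma=0.5)
```

With five fractional-frequency bins the feature vector has 1 + 5 + 25 entries (orders zero, one and two). Setting `alpha = np.pi / 2` reduces the kernel to the ordinary Fourier kernel, so the same code then behaves like a Gabor-type scattering cascade.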
About the journal:
Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for the rapid dissemination of knowledge and experience to engineers and scientists working in the research, development or practical application of signal processing.
Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.