A Time-Frequency Network with Channel Attention and Non-Local Modules for Artificial Bandwidth Extension
Authors: Yuanjie Dong, Yaxing Li, Xiaoqi Li, Shanjie Xu, Dan Wang, Zhihui Zhang, Shengwu Xiong
Published in: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6954-6958
Publication date: 2020-05-01
DOI: https://doi.org/10.1109/ICASSP40776.2020.9053769
Citations: 5
Abstract
Convolutional neural networks (CNNs) have recently attracted increasing attention for the artificial bandwidth extension (ABE) task. However, these methods use the flipped low-frequency phase to reconstruct speech signals, which may lead to the well-known invalid short-time Fourier transform (STFT) problem. Moreover, convolutional operations only let a network construct informative features by fusing channel-wise and spatial information within the local receptive field of each layer. In this paper, we introduce a Time-Frequency Network (TFNet) with channel attention (CA) and non-local (NL) modules for ABE. The TFNet exploits information from the time-domain and frequency-domain branches concurrently to avoid the invalid STFT problem. To capture channel and spatial dependencies, we incorporate the CA and NL modules into the fully convolutional neural network used for the time and frequency branches of TFNet. Experimental results demonstrate that the proposed method outperforms the competing method.
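The abstract does not give the exact form of the CA and NL modules, so the following is only a minimal NumPy sketch of the two general ideas, not the authors' implementation. It assumes a squeeze-and-excitation style gate for channel attention and a plain dot-product non-local block with a residual connection. The function names, the weight matrices `w1` and `w2`, the reduction ratio `r`, and the `(channels, time)` feature layout are all hypothetical choices made for illustration.

```python
import numpy as np

def channel_attention(x, w1, w2):
    """Assumed SE-style gate. x: (C, T); w1: (C // r, C); w2: (C, C // r)."""
    squeeze = x.mean(axis=1)                       # per-channel global average, shape (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # bottleneck with ReLU, shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate in (0, 1), shape (C,)
    return x * gate[:, None]                       # rescale every channel by its gate

def non_local(x):
    """Assumed dot-product non-local block. x: (C, T)."""
    scores = x.T @ x                               # pairwise similarity of positions, (T, T)
    scores -= scores.max(axis=1, keepdims=True)    # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    context = x @ weights.T                        # context[:, i] = sum_j weights[i, j] * x[:, j]
    return x + context                             # residual connection

rng = np.random.default_rng(0)
C, T, r = 8, 16, 4                                 # hypothetical sizes
x = rng.standard_normal((C, T))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = non_local(channel_attention(x, w1, w2))
print(out.shape)
```

The gate reweights whole channels using a global summary, while the non-local block lets every position draw on all other positions. Both reach beyond the local receptive field of a single convolution, which is the limitation the abstract points to.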