ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Compressed Sensing Based Channel Estimation and Open-loop Training Design for Hybrid Analog-digital Massive MIMO Systems
Khaled Ardah, Bruno Sokal, A. D. Almeida, M. Haardt
DOI: 10.1109/ICASSP40776.2020.9054443 | Pages: 4597-4601
Abstract: Channel estimation in hybrid analog-digital massive MIMO systems is a challenging problem due to the high channel dimension, the low signal-to-noise ratio before beamforming, and the reduced number of radio-frequency chains. Compressed sensing based algorithms have been adopted to address these challenges by leveraging the sparse nature of millimeter-wave MIMO channels. In compressed sensing based methods, the training vectors should be designed carefully to guarantee recoverability. Although random vectors offer an overwhelming recoverability guarantee, it has recently been shown that an optimized design, obtained by minimizing the mutual coherence of the resulting sensing matrix, can improve this guarantee. In this paper, we propose an open-loop hybrid analog-digital beam-training framework in which a given sensing matrix is decomposed into analog and digital beamformers. The sensing matrix can be designed efficiently offline to reduce computational complexity. Simulation results show that the proposed training method achieves lower mutual coherence and better channel estimation performance than the benchmark methods.
Citations: 6
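The recoverability criterion the abstract optimizes is the mutual coherence of the sensing matrix. A minimal numpy sketch of that quantity with illustrative dimensions; the paper's analog/digital decomposition and offline design procedure are not reproduced here:

```python
import numpy as np

def mutual_coherence(A):
    # Largest off-diagonal inner-product magnitude between
    # column-normalized atoms of the sensing matrix.
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(A.conj().T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(0)
M, N = 32, 128  # illustrative: measurements x dictionary atoms
A_random = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
print(f"mutual coherence of a random complex sensing matrix: {mutual_coherence(A_random):.3f}")
```

A training design that lowers this value relative to the random baseline improves the recoverability guarantee, which is the quantity the proposed open-loop framework targets.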
Joint Training of Deep Neural Networks for Multi-Channel Dereverberation and Speech Source Separation
M. Togami
DOI: 10.1109/ICASSP40776.2020.9053791 | Pages: 3032-3036
Abstract: In this paper, we propose joint training of two deep neural networks (DNNs) for dereverberation and speech source separation. The proposed method connects the first DNN, the dereverberation part, the second DNN, and the speech source separation part in a cascade. Rather than training each DNN separately, an integrated loss function that evaluates the output signal after both dereverberation and speech source separation is adopted. The proposed method estimates the output signal as a probabilistic variable. Recently, in the speech source separation context, we proposed a loss function that evaluates the estimated posterior probability density function (PDF) of the output signal. In this paper, we extend this loss function to evaluate not only speech source separation performance but also dereverberation performance. Since the output signal of the dereverberation part is converted into the input feature of the second DNN, the gradient of the loss function is back-propagated into the first DNN through this input feature. Experimental results show that the proposed joint training of the two DNNs is effective, and that the posterior-PDF-based loss function is effective in the joint training context.
Citations: 5
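The central idea is that one integrated loss, evaluated after both stages, back-propagates through the separation network into the dereverberation network. A toy PyTorch sketch under strong simplifications: plain linear stand-ins for the two DNNs, and MSE in place of the paper's posterior-PDF-based loss:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two DNNs; the real model operates on
# multi-channel spectral features, not plain 64-dim vectors.
derev_net = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
sep_net = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))

opt = torch.optim.Adam(
    list(derev_net.parameters()) + list(sep_net.parameters()), lr=1e-3)

x = torch.randn(8, 64)       # reverberant mixture features (toy)
target = torch.randn(8, 64)  # clean separated targets (toy)

# One integrated loss on the cascade output: gradients flow back through
# sep_net into derev_net, so neither network is trained in isolation.
loss = nn.functional.mse_loss(sep_net(derev_net(x)), target)
opt.zero_grad()
loss.backward()
opt.step()
```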
Deblurring And Super-Resolution Using Deep Gated Fusion Attention Networks For Face Images
Chao Yang, Long-Wen Chang
DOI: 10.1109/ICASSP40776.2020.9053784 | Pages: 1623-1627
Abstract: Image deblurring and super-resolution are very important in image processing tasks such as face verification. However, images captured outdoors are often blurry and of low resolution. To solve this problem, we propose a deep gated fusion attention network (DGFAN) that generates a high-resolution image without blurring artifacts. We extract features from two task-independent structures for deblurring and super-resolution, avoiding the error propagation of a cascade of deblurring followed by super-resolution. We also add an attention module that uses channel-wise and spatial-wise features to obtain better representations, and propose an edge loss function that makes the model focus on facial features such as the eyes and nose. DGFAN performs favorably against state-of-the-art methods in terms of PSNR and SSIM, and the clear images it generates improve face verification accuracy.
Citations: 5
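The abstract's attention module reweights features both channel-wise and spatial-wise. A compact PyTorch sketch of one such block; the layer sizes and exact layout inside DGFAN are assumptions for illustration:

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite per channel.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        # Spatial attention: one weight map over the H x W positions.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)     # reweight feature channels
        return x * self.spatial(x)  # reweight spatial positions

feat = torch.randn(1, 16, 32, 32)
print(ChannelSpatialAttention(16)(feat).shape)  # torch.Size([1, 16, 32, 32])
```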
Adversarial Multi-Task Learning for Speaker Normalization in Replay Detection
Gajan Suthokumar, V. Sethu, Kaavya Sriskandaraja, E. Ambikairajah
DOI: 10.1109/ICASSP40776.2020.9054322 | Pages: 6609-6613
Abstract: Spoofing detection algorithms in voice biometrics are adversely affected by differences in the speech characteristics of the various target users. In this paper, we propose a novel speaker normalisation technique that employs adversarial multi-task learning to compensate for this speaker variability. The proposed system is designed to learn a feature space that discriminates between genuine and replayed speech while simultaneously reducing the discrimination between different speakers. We initially characterise the impact of speaker variability and quantify the effect of the proposed speaker normalisation technique directly on the feature distributions. Following this, we validate the technique in spoofing detection experiments on two corpora, ASVspoof 2017 v2.0 and BTAS 2016 replay, and demonstrate its effectiveness. We obtain EERs of 7.11% and 0.83% on the two corpora respectively, lower than all relevant baselines.
Citations: 6
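Adversarial multi-task speaker normalisation of this kind is commonly realised with a gradient-reversal layer: a speaker classifier trains on the shared features, while its reversed gradient pushes the encoder to discard speaker identity. A minimal PyTorch sketch under that assumption; the paper's exact mechanism and all layer sizes are illustrative:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; negated, scaled gradient in the
    # backward pass, so minimizing the speaker loss *removes* speaker
    # information from the shared encoder.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Linear(40, 32)       # shared feature extractor (toy)
spoof_head = nn.Linear(32, 2)     # genuine vs. replayed speech
speaker_head = nn.Linear(32, 10)  # adversarial speaker classifier

x, spoof_y, spk_y = torch.randn(4, 40), torch.randint(2, (4,)), torch.randint(10, (4,))
z = encoder(x)
loss = (nn.functional.cross_entropy(spoof_head(z), spoof_y)
        + nn.functional.cross_entropy(speaker_head(GradReverse.apply(z, 1.0)), spk_y))
loss.backward()
```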
A Differential Approach for Rain Field Tomographic Reconstruction Using Microwave Signals from LEO Satellites
Xi Shen, D. Huang, C. Vincent, Wenxiao Wang, R. Togneri
DOI: 10.1109/ICASSP40776.2020.9054284 | Pages: 9001-9005
Abstract: A differential approach is proposed for tomographic rain field reconstruction using the estimated signal-to-noise ratio of microwave signals from low earth orbit (LEO) satellites at ground receivers. The unknown baseline values are eliminated before least squares is used to reconstruct the attenuation field. Simulations are performed both when the baseline is modelled by an autoregressive process and when the baseline is assumed fixed. Comparisons between the reconstruction results of the differential and non-differential approaches suggest that the differential approach performs better in both scenarios. For a high correlation coefficient and low model noise in the autoregressive process, the differential approach surpasses the non-differential approach significantly.
Citations: 3
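A toy numpy illustration of the differencing idea in the fixed-baseline scenario: subtracting a rain-free measurement cancels the unknown per-link baseline, after which least squares recovers the attenuation field. The link geometry, noise level, and field statistics below are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n_links, n_pixels = 60, 25
A = rng.random((n_links, n_pixels))       # toy path lengths of links through pixels
baseline = rng.normal(5.0, 1.0, n_links)  # unknown per-link baseline SNR (fixed)

x_rain = rng.exponential(1.0, n_pixels)   # attenuation field to reconstruct
noise = lambda: 0.01 * rng.standard_normal(n_links)
y_dry = baseline + noise()                # rain-free: attenuation ~ 0
y_rain = baseline - A @ x_rain + noise()  # rain attenuates the received SNR

# Differencing eliminates the baseline before the least-squares step.
x_hat, *_ = np.linalg.lstsq(A, y_dry - y_rain, rcond=None)
print(f"mean absolute reconstruction error: {np.abs(x_hat - x_rain).mean():.3f}")
```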
Stochastic Graph Neural Networks
Zhan Gao, E. Isufi, Alejandro Ribeiro
DOI: 10.1109/ICASSP40776.2020.9054424 | Pages: 9080-9084
Abstract: Graph neural networks (GNNs) model nonlinear representations in graph data, with applications in distributed agent coordination, control, and planning, among others. However, current GNN implementations assume ideal distributed scenarios and ignore link fluctuations that occur due to environmental or human factors. In these situations, the GNN fails at its distributed task if the topological randomness is not accounted for. To overcome this issue, we put forth the stochastic graph neural network (SGNN) model: a GNN in which the distributed graph convolutional operator is modified to account for network changes. Since stochasticity brings in a new paradigm, we develop a novel learning process for the SGNN and use the stochastic gradient descent (SGD) algorithm to estimate the parameters. We prove that the SGNN learning process converges to a stationary point under mild Lipschitz assumptions. Numerical simulations corroborate the proposed theory and show improved performance of the SGNN compared with the conventional GNN when operating over random time-varying graphs.
Citations: 25
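The defining change is that each distributed exchange happens over a random realization of the graph, so the convolution runs on a randomly masked shift operator at every step. A numpy sketch of that idea; the graph size, link-activation probability, and filter taps are illustrative:

```python
import numpy as np

def graph_filter(S, x, h):
    # Distributed graph convolution: sum_k h[k] * S^k @ x.
    out, Skx = np.zeros_like(x), x.copy()
    for hk in h:
        out += hk * Skx
        Skx = S @ Skx
    return out

rng = np.random.default_rng(0)
n = 20
S = np.triu((rng.random((n, n)) < 0.2).astype(float), 1)
S = S + S.T                     # nominal undirected graph shift operator

x = rng.standard_normal(n)      # graph signal
h = np.array([1.0, 0.5, 0.25])  # filter taps

# SGNN idea: each link is only up with probability p, and the filter
# operates on the realized random graph rather than the nominal one.
p = 0.8
mask = np.triu(rng.random((n, n)) < p, 1)
y = graph_filter(S * (mask | mask.T), x, h)
```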
Playing Technique Recognition by Joint Time–Frequency Scattering
Changhong Wang, V. Lostanlen, Emmanouil Benetos, E. Chew
DOI: 10.1109/ICASSP40776.2020.9053474 | Pages: 881-885
Abstract: Playing techniques are important expressive elements in music signals. In this paper, we propose a recognition system based on the joint time–frequency scattering transform (jTFST) for pitch evolution-based playing techniques (PETs), a group of playing techniques with monotonic pitch changes over time. The jTFST represents spectro-temporal patterns in the time–frequency domain, capturing discriminative information about PETs. As a case study, we analyse three commonly used PETs of the Chinese bamboo flute: acciacatura, portamento, and glissando, and encode their characteristics using the jTFST. To verify the proposed approach, we create a new dataset, CBF-petsDB, containing PETs played in isolation as well as in the context of whole pieces performed and annotated by professional players. Feeding the jTFST to a machine learning classifier, we obtain F-measures of 71% for acciacatura, 59% for portamento, and 83% for glissando detection, and provide explanatory visualisations of the scattering coefficients for each technique.
Citations: 12
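Conceptually, the jTFST applies a second layer of wavelets jointly over time and log-frequency, so filters oriented in the time–frequency plane respond to specific pitch slopes such as a glissando. A rough numpy sketch of that intuition only, with a single hand-built Gabor filter standing in for the scattering filter bank (this is not the actual jTFST implementation):

```python
import numpy as np
from scipy.signal import stft, convolve2d

# Toy glissando: a rising chirp, i.e. a monotonic pitch change.
fs = 8000
t = np.linspace(0, 1, fs, endpoint=False)
sig = np.sin(2 * np.pi * (400 * t + 300 * t**2))

_, _, Z = stft(sig, fs=fs, nperseg=256)
S1 = np.abs(Z)  # first-order layer: magnitude spectrogram

# Second-order joint filter: a 2D Gabor oriented diagonally in the
# time-frequency plane, responding to upward pitch trajectories.
f_idx, t_idx = np.mgrid[-8:9, -8:9]
gabor = np.exp(-(f_idx**2 + t_idx**2) / 32.0) * np.cos(0.8 * (t_idx - f_idx))
S2 = np.abs(convolve2d(S1, gabor, mode="same"))
print(S1.shape, S2.shape)  # joint coefficients on the same grid as S1
```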
An Efficient Augmented Lagrangian-Based Method for Linear Equality-Constrained Lasso
Zengde Deng, Man-Chung Yue, A. M. So
DOI: 10.1109/ICASSP40776.2020.9053722 | Pages: 5760-5764
Abstract: Variable selection is one of the most important tasks in statistics and machine learning. To incorporate more prior information about the regression coefficients, various constrained Lasso models have been proposed in the literature. Compared with the classic (unconstrained) Lasso model, the algorithmic aspects of constrained Lasso models are much less explored. In this paper, we demonstrate how the recently developed semismooth Newton-based augmented Lagrangian framework can be extended to solve a linear equality-constrained Lasso model. A key technical challenge not present in prior works is the lack of strong convexity in our dual problem, which we overcome by adopting a regularization strategy. We show that under mild assumptions, the proposed method converges superlinearly. Moreover, extensive numerical experiments on both synthetic and real-world data show that our method can be substantially faster than existing first-order methods while achieving better solution accuracy.
Citations: 4
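For concreteness, here is a bare-bones augmented-Lagrangian solver for the model min 0.5||Ax − b||² + λ||x||₁ subject to Cx = d. It is a sketch only: the paper's semismooth-Newton inner solver is replaced by plain proximal-gradient steps, and all problem sizes are invented:

```python
import numpy as np

def soft(v, t):
    # Soft-thresholding: proximal operator of t * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def al_constrained_lasso(A, b, C, d, lam, rho=1.0, outer=50, inner=200):
    x, y = np.zeros(A.shape[1]), np.zeros(C.shape[0])
    # Step size from a Lipschitz bound on the smooth part's gradient.
    L = np.linalg.norm(A, 2) ** 2 + rho * np.linalg.norm(C, 2) ** 2
    for _ in range(outer):
        for _ in range(inner):  # inner solve of the AL subproblem
            grad = A.T @ (A @ x - b) + C.T @ (y + rho * (C @ x - d))
            x = soft(x - grad / L, lam / L)
        y = y + rho * (C @ x - d)  # multiplier (dual) update
    return x

rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 100)), rng.standard_normal(50)
C, d = np.ones((1, 100)), np.array([1.0])  # e.g. coefficients sum to one
x = al_constrained_lasso(A, b, C, d, lam=0.1)
print(f"constraint residual: {abs(C @ x - d).item():.2e}")
```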
Volume Reconstruction for Light Field Microscopy
Herman Verinaz-Jadan, P. Song, Carmel L. Howe, Amanda J. Foust, P. Dragotti
DOI: 10.1109/ICASSP40776.2020.9053433 | Pages: 1459-1463
Abstract: Light Field Microscopy (LFM) is a 3D imaging technique that captures volumetric information in a single snapshot. It is appealing in microscopy because of its simple implementation and because it is much faster than methods involving scanning. However, volume reconstruction for LFM suffers from low lateral resolution, high computational cost, and reconstruction artifacts near the native object plane. In this work, we make two contributions. First, we propose a simplification of the forward model based on a novel discretization approach that allows us to accelerate the computation without drastically increasing memory consumption. Second, we show experimentally that including regularization priors and an appropriate initialization strategy removes the artifacts near the native object plane; the algorithm we use for this is ADMM. The combination of the two techniques leads to a method that outperforms classic volume reconstruction approaches (variants of Richardson-Lucy) in terms of average computational time and image quality (PSNR).
Citations: 4
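As a sketch of the ADMM machinery in this setting, here is a least-squares fit with a simple non-negativity prior, min 0.5||Hx − y||² subject to x ≥ 0. The paper's actual forward model and regularizers are richer; the operator and sizes below are toy assumptions:

```python
import numpy as np

def admm_nonneg_ls(H, y, rho=1.0, iters=200):
    n = H.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    HtH = H.T @ H + rho * np.eye(n)  # formed once, reused every iteration
    Hty = H.T @ y
    for _ in range(iters):
        x = np.linalg.solve(HtH, Hty + rho * (z - u))  # data-fit step
        z = np.maximum(x + u, 0.0)                     # prox of the prior
        u = u + x - z                                  # dual update
    return z

rng = np.random.default_rng(0)
H = np.abs(rng.standard_normal((200, 50)))      # toy forward (projection) operator
x_true = np.maximum(rng.standard_normal(50), 0) # non-negative volume (toy)
y = H @ x_true + 0.01 * rng.standard_normal(200)
print(f"max reconstruction error: {np.abs(admm_nonneg_ls(H, y) - x_true).max():.3f}")
```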
Supervised Deep Hashing for Efficient Audio Event Retrieval
Arindam Jati, Dimitra Emmanouilidou
DOI: 10.1109/ICASSP40776.2020.9053766 | Pages: 4497-4501
Abstract: Efficient retrieval of audio events can facilitate real-time implementation of numerous query- and search-based systems. This work investigates the potency of different hashing techniques for efficient audio event retrieval. Multiple state-of-the-art weak audio embeddings are employed for this purpose. The performance of four classical unsupervised hashing algorithms is explored as part of an off-the-shelf analysis. We then propose a partially supervised deep hashing framework that transforms the weak embeddings into a low-dimensional space while optimizing for efficient hash codes. The model uses only a fraction of the available labels and is shown to significantly improve retrieval accuracy on two widely employed audio event datasets. The extensive analysis and comparison between supervised and unsupervised hashing methods presented here give insights into the quantizability of audio embeddings. This work provides a first look at efficient audio event retrieval systems and aims to set baselines for future research.
Citations: 2
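To make the retrieval mechanics concrete, here is a numpy sketch in the spirit of the classical unsupervised baselines: random-hyperplane hashing of embeddings to binary codes, then ranking by Hamming distance. The supervised deep model itself is not reproduced, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, bits, n = 128, 32, 1000
W = rng.standard_normal((d, bits))  # random hyperplanes (LSH-style baseline)

db = rng.standard_normal((n, d))    # database of audio embeddings (toy)
codes = db @ W > 0                  # n x bits boolean hash codes

query = db[0] + 0.1 * rng.standard_normal(d)  # noisy copy of item 0
qcode = query @ W > 0

# Retrieval reduces to ranking by Hamming distance on compact codes.
hamming = (codes != qcode).sum(axis=1)
print(np.argsort(hamming)[:5])      # item 0 should rank first
```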