ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers, Page 8

Raw Waveform Based End-to-end Deep Convolutional Network for Spatial Localization of Multiple Acoustic Sources

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054090

Harshavardhan Sundar, Weiran Wang, Ming Sun, Chao Wang

Citations: 28

Projection Free Dynamic Online Learning

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053771

Deepak S. Kalhan, A. S. Bedi, Alec Koppel, K. Rajawat, Abhishek K. Gupta, Adrish Banerjee

Citations: 2

Angular Discriminative Deep Feature Learning for Face Verification

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053675

Bowen Wu, Huaming Wu

Abstract: Thanks to the development of deep Convolutional Neural Networks (CNNs), face verification has rapidly achieved great success. In particular, Deep Distance Metric Learning (DDML), an emerging area, has brought substantial improvements to the computer vision community. Softmax loss is widely used to supervise the training of most available CNN models, whereas feature normalization is often used to compute pair similarities at test time. To bridge this gap between training and testing, we require that the intra-class cosine similarity of the inner-product layer before the softmax loss be larger than a margin during training, accompanied by the supervision signal of the softmax loss. To enhance the discriminative power of the deeply learned features, we extend the intra-class constraint to force the intra-class cosine similarity to exceed, by a margin, the mean of the nearest neighboring inter-class similarities in the normalized exponential feature projection space. Extensive experiments on the Labeled Faces in the Wild (LFW) and YouTube Faces (YTF) datasets demonstrate that the proposed approaches achieve competitive performance on the open-set face verification task.

Pages: 2133-2137
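The margin constraint on intra-class cosine similarity described in the abstract above can be illustrated with a small sketch. This is not the authors' loss: the hinge form, the function names `cosine_similarity` and `intra_class_margin_penalty`, and the margin value 0.5 are all assumptions chosen for illustration. The abstract only states that the cosine similarity between a feature and its own class should exceed a margin.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def intra_class_margin_penalty(feature, class_weight, margin):
    """Hinge penalty that is zero once cos(feature, class_weight) >= margin.

    Assumption: the hinge form is illustrative only; the abstract does not
    give the exact formulation used in the paper.
    """
    return max(0.0, margin - cosine_similarity(feature, class_weight))


# A feature aligned with its class weight vector incurs no penalty.
aligned = intra_class_margin_penalty([2.0, 0.0], [1.0, 0.0], margin=0.5)
# A feature orthogonal to its class weight vector is penalized by the full margin.
orthogonal = intra_class_margin_penalty([0.0, 3.0], [1.0, 0.0], margin=0.5)
print(aligned, orthogonal)
```

Because cosine similarity ignores vector length, a penalty of this kind acts on the same normalized quantity that is compared at test time, which is the training/testing gap the abstract refers to.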

Citations: 4

SliceNet: Slice-Wise 3D Shapes Reconstruction from Single Image

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054674

Yunjie Wu, Zhengxing Sun, Youcheng Song, Yunhan Sun, Jinlong Shi

Abstract: 3D object reconstruction from a single image is a highly ill-posed problem that requires strong prior knowledge of 3D shapes. Deep learning methods are popular for this task; in particular, most works use 3D deconvolution to generate 3D shapes. However, the resolution of the results is limited by the high resource consumption of 3D deconvolution. In this paper, we propose SliceNet, which sequentially generates 2D slices of 3D shapes with shared 2D deconvolution parameters. An RNN is also introduced to capture the relations between slices. Our model has three main advantages. First, the introduction of the RNN allows the CNN to focus more on local geometric details, improving the fine-grained plausibility of the results. Second, replacing 3D deconvolution with 2D deconvolution greatly reduces memory consumption, enabling higher-resolution final results. Third, a slice-aware attention mechanism is designed to provide dynamic information for the generation of each slice, which helps model the differences between slices and makes the learning process easier. Experiments on both synthesized and real data illustrate the effectiveness of our method.

Pages: 1833-1837

Citations: 2

Robust Phase Retrieval with Outliers

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053060

Xue Jiang, H. So, Xingzhao Liu

Citations: 1

Exploiting Vocal Tract Coordination Using Dilated CNNs For Depression Detection In Naturalistic Environments

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054323

Zhaocheng Huang, J. Epps, Dale Joachim

Citations: 26

RNN-Transducer with Stateless Prediction Network

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9054419

M. Ghodsi, Xiaofeng Liu, J. Apfel, Rodrigo Cabrera, Eugene Weinstein

Abstract: The RNN-Transducer (RNNT) outperforms classic Automatic Speech Recognition (ASR) systems when a large amount of supervised training data is available. For low-resource languages, RNNT models overfit and cannot directly take advantage of additional large text corpora as classic ASR systems can. We focus on the prediction network of the RNNT, since it is believed to be analogous to the Language Model (LM) in classic ASR systems. We pre-train the prediction network with text-only data, which turns out not to be helpful. Moreover, removing the recurrent layers from the prediction network, which makes the prediction network stateless, performs virtually as well as the original RNNT model when using wordpieces. The stateless prediction network does not depend on the previous output symbols, except the last one. It therefore simplifies both the RNNT architecture and inference. Our results suggest that the RNNT prediction network does not function as the LM does in classic ASR. Instead, it merely helps the model align to the input audio, while the RNNT encoder and joint networks capture both the acoustic and the linguistic information.

Pages: 7049-7053
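To make the idea of a stateless prediction network in the abstract above concrete, here is a minimal sketch under stated assumptions. It is not the authors' implementation: the class name `StatelessPredictionNetwork`, the randomly initialized embedding table, and the sizes are hypothetical. The only property taken from the abstract is that the output depends on the last output symbol and on nothing earlier.

```python
import random


class StatelessPredictionNetwork:
    """Toy prediction network: an embedding lookup on the last symbol only.

    Assumption: a plain lookup table stands in for whatever layers remain
    once the recurrent layers are removed; the table is random, not trained.
    """

    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        self.table = [
            [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size)
        ]

    def __call__(self, history):
        # Only history[-1] is read, so no recurrent state is carried between
        # calls. The history is assumed to be non-empty (for example, it
        # starts with a blank or start symbol).
        return self.table[history[-1]]


net = StatelessPredictionNetwork(vocab_size=10, dim=4)
# Two different histories that end in the same symbol give the same output.
same = net([5, 2, 7]) == net([1, 7])
print(same)
```

A practical consequence of this design is that, during beam search, hypotheses ending in the same symbol can share one prediction-network output, which is one way to read the abstract's remark that inference is simplified.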

Citations: 77

Favorable Propagation and Linear Multiuser Detection for Distributed Antenna Systems

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053449

R. Gholami, L. Cottatellucci, D. Slock

Citations: 5

High-Accuracy Classification of Attention Deficit Hyperactivity Disorder with L2,1-Norm Linear Discriminant Analysis

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053391

Yibin Tang, Xufei Li, Ying Chen, Y. Zhong, A. Jiang, Xiaofeng Liu

Abstract: Attention Deficit Hyperactivity Disorder (ADHD) is a high-incidence neurobehavioral disorder in school-age children, and its neurobiological classification is meaningful for clinicians. Existing ADHD classification methods suffer from two problems: insufficient data and noise disturbance. Here, a high-accuracy classification method is proposed that uses brain Functional Connectivity (FC) as the material for ADHD feature analysis. Specifically, we introduce a binary hypothesis testing framework as the classification outline to cope with the insufficient data in ADHD databases. Under the binary hypotheses, the FCs of the test data are allowed to be used for training and thus affect the subspace learning of the training data. To overcome noise disturbance, an l2,1-norm LDA model is adopted to robustly learn ADHD features in subspaces. The subspace energies of the training data under the binary hypotheses are then calculated, and an energy-based comparison is finally performed to identify ADHD individuals. On the ADHD-200 database, experiments show that our method outperforms other state-of-the-art methods, with an average accuracy of 97.6%.

Pages: 1170-1174
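The l2,1 norm named in the abstract above is a standard matrix norm, and a short sketch may help readers unfamiliar with it. The code below is not the paper's LDA model; it only computes the norm, assuming the common row-wise convention (the sum of the Euclidean norms of the rows; some authors define it over columns instead). The function names and the toy matrices are hypothetical.

```python
import math


def l21_norm(matrix):
    """Sum over rows of each row's Euclidean (l2) norm (row-wise convention)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


def squared_frobenius_norm(matrix):
    """Sum of squared entries, shown for comparison."""
    return sum(x * x for row in matrix for x in row)


clean = [[3.0, 4.0], [0.0, 0.0]]
# Same matrix, but the second row is replaced by a large outlier row.
noisy = [[3.0, 4.0], [30.0, 40.0]]

l21_clean, l21_noisy = l21_norm(clean), l21_norm(noisy)
fro_clean, fro_noisy = squared_frobenius_norm(clean), squared_frobenius_norm(noisy)
print(l21_clean, l21_noisy, fro_clean, fro_noisy)
```

In this toy case the outlier row raises the l2,1 norm in proportion to its length, while the squared Frobenius norm grows with the square of that length. This is the usual informal argument for why l2,1-based objectives are considered less sensitive to noisy samples.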

Citations: 2

Redundant Convolutional Network With Attention Mechanism For Monaural Speech Enhancement

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2020-05-01 DOI: 10.1109/ICASSP40776.2020.9053277

Tian Lan, Yilan Lyu, Guoqiang Hui, Refuoe Mokhosi, Sen Li, Qiao Liu

Citations: 2