{"title":"Noisy objective functions based on the f-divergence","authors":"M. Nußbaum-Thom, R. Schlüter, V. Goel, H. Ney","doi":"10.1109/ICASSP.2017.7952572","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952572","url":null,"abstract":"Dropout, the random dropping out of activations according to a specified rate, is a very simple but effective method to avoid over-fitting of deep neural networks to the training data.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115008221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of a complementary hearing aid for spatial sound segregation","authors":"Luca Giuliani, L. Brayda, Sara Sansalone, S. Repetto, M. Ricchetti","doi":"10.1109/ICASSP.2017.7952150","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952150","url":null,"abstract":"Spatial segregation of sounds is a common and simple task for people with healthy hearing. Unfortunately, people who suffer from partial hearing loss have great trouble separating sound sources in crowded and noisy environments. Social isolation is the most common consequence, and hearing aids are not a solution, especially in severely noisy conditions, because of their limited directionality.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121486405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compressive information acquisition with hardware impairments and constraints: A case study","authors":"S. Gopalakrishnan, T. Moy, Upamanyu Madhow, N. Verma","doi":"10.1109/ICASSP.2017.7953323","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953323","url":null,"abstract":"Compressive information acquisition is a natural approach for low-power hardware front ends, since most natural signals are sparse in some basis. Key design questions include the impact of hardware impairments (e.g., nonlinearities) and constraints (e.g., spatially localized computations) on the fidelity of information acquisition. Our goal in this paper is to obtain specific insights into such issues through modeling of a Large Area Electronics (LAE)-based image acquisition system. We show that compressive information acquisition is robust to stochastic nonlinearities, and that appropriately designed spatially localized computations are effective, by evaluating the performance of reconstruction and classification based on the information acquired.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"42 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114040262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated robust Anuran classification by extracting elliptical feature pairs from audio spectrograms","authors":"Marcello Tomasini, Katrina Smart, R. Menezes, M. Bush, Eraldo Ribeiro","doi":"10.1109/ICASSP.2017.7952610","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952610","url":null,"abstract":"Ecologists can assess the health of wetlands by monitoring populations of animals such as Anurans (i.e., frogs and toads), which are sensitive to habitat changes. But surveying anurans requires trained experts to identify species from the animals' mating calls. This identification task can be streamlined by automation. To this end, we propose an automatic frog-call classification algorithm and a smartphone application that drastically simplify the monitoring of anuran populations. We offer three main contributions. First, we introduce a classification method that has an average accuracy of 86% on a dataset of 736 calls from 48 anuran species from the United States. Our dataset is much larger and more diverse than those of previous works on anuran classification. Second, we extract a new type of spectrogram feature that avoids syllable segmentation and the manual cleaning of the recordings. Our method also works with recordings of variable length. Third, our method uses GPS location and a voting scheme to reliably deal with a large number of species and high levels of noise.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131012217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised feature extraction for hyperspectral images using combined low rank representation and locally linear embedding","authors":"Mengdi Wang, Jing Yu, Lijuan Niu, Weidong Sun","doi":"10.1109/ICASSP.2017.7952392","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952392","url":null,"abstract":"Hyperspectral images (HSIs) provide hundreds of narrow spectral bands for land covers, and can thus provide more powerful discriminative information for land-cover classification. However, HSIs suffer from the curse of high dimensionality, so dimension reduction and feature extraction are essential for their application. In this paper, we propose an unsupervised feature extraction method for HSIs using combined low rank representation and locally linear embedding (LRR LLE). The proposed method can simultaneously use both the spectral and spatial correlation within HSIs, with LRR modelling the intrinsic property of a union of low-rank subspaces and LLE considering the correlation within spatial neighbours. Experiments are conducted on real HSI datasets, and the classification results demonstrate that the features extracted by LRR LLE are more discriminative than those of state-of-the-art methods.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125141769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint optimisation of tandem systems using Gaussian mixture density neural network discriminative sequence training","authors":"Chao Zhang, P. Woodland","doi":"10.1109/ICASSP.2017.7953111","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953111","url":null,"abstract":"The use of deep neural networks (DNNs) for feature extraction and Gaussian mixture models (GMMs) for acoustic modelling is often termed a tandem system configuration and can be viewed as a Gaussian mixture density neural network (MDNN). Compared to the direct use of DNN output probabilities in the acoustic model, the tandem approach suffers from a major weakness in that the feature extraction stage and the final acoustic models are optimised separately. This paper proposes a joint optimisation approach to all the stages of the tandem acoustic model by using MDNN discriminative sequence training. A set of techniques is used to improve the training performance and stability. Experiments using the multi-genre broadcast (MGB) English data show that the proposed method gave a 6% relative word error rate (WER) reduction over a traditional discriminatively trained tandem system. The resulting jointly optimised tandem systems are comparable in WER to hybrid DNN systems optimised using discriminative sequence training with the same number of parameters.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125168310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recovery of sparse signals via Branch and Bound Least-Squares","authors":"Abolfazl Hashemi, H. Vikalo","doi":"10.1109/ICASSP.2017.7953060","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953060","url":null,"abstract":"We present an algorithm, referred to as Branch and Bound Least-Squares (BBLS), for the recovery of sparse signals from a few linear combinations of their entries. Sparse signal reconstruction is readily cast as the problem of finding a sparse solution to an underdetermined system of linear equations. To solve it, BBLS employs an efficient search strategy of traversing a tree whose nodes represent the columns of the coefficient matrix and selects a subset of those columns by relying on Orthogonal Least-Squares (OLS) procedure. We state sufficient conditions under which in noise-free settings BBLS with high probability constructs a tree path which corresponds to the true support of the unknown sparse signal. Moreover, we empirically demonstrate that BBLS provides performance superior to that of existing algorithms in terms of accuracy, running time, or both. In the scenarios where the columns of the coefficient matrix are characterized by high correlation, BBLS is particularly beneficial and significantly outperforms existing methods.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125346789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
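The Orthogonal Least-Squares step that BBLS relies on at each tree node is easy to sketch. The following is a minimal greedy OLS column-selection routine, not the paper's branch-and-bound search itself; the function name and sizes are illustrative assumptions:

```python
import numpy as np

def ols_select(A, y, k):
    """Greedy Orthogonal Least-Squares: at each step, add the column that,
    together with the columns already chosen, minimizes the least-squares
    residual to y. BBLS uses this criterion inside a tree search; only the
    plain greedy variant is sketched here."""
    support = []
    for _ in range(k):
        best_j, best_res = None, np.inf
        for j in range(A.shape[1]):
            if j in support:
                continue
            S = A[:, support + [j]]                      # candidate support
            x, *_ = np.linalg.lstsq(S, y, rcond=None)    # LS fit on support
            r = np.linalg.norm(y - S @ x)                # residual norm
            if r < best_res:
                best_j, best_res = j, r
        support.append(best_j)
    S = A[:, support]
    x, *_ = np.linalg.lstsq(S, y, rcond=None)
    return support, x
```

In a noise-free setting with enough measurements, the greedy selection typically recovers the true support exactly, which is the property the paper's sufficient conditions formalize for the full tree search.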
{"title":"Building recurrent networks by unfolding iterative thresholding for sequential sparse recovery","authors":"Scott Wisdom, Thomas Powers, J. Pitton, L. Atlas","doi":"10.1109/ICASSP.2017.7952977","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952977","url":null,"abstract":"Historically, sparse methods and neural networks, particularly modern deep learning methods, have been relatively disparate areas. Sparse methods are typically used for signal enhancement, compression, and recovery, usually in an unsupervised framework, while neural networks commonly rely on a supervised training set. In this paper, we use the specific problem of sequential sparse recovery, which models a sequence of observations over time using a sequence of sparse coefficients, to show how algorithms for sparse modeling can be combined with supervised deep learning to improve sparse recovery. Specifically, we show that the iterative soft-thresholding algorithm (ISTA) for sequential sparse recovery corresponds to a stacked recurrent neural network (RNN) under specific architecture and parameter constraints. Then we demonstrate the benefit of training this RNN with backpropagation using supervised data for the task of column-wise compressive sensing of images. This training corresponds to adaptation of the original iterative thresholding algorithm and its parameters. Thus, we show by example that sparse modeling can provide a rich source of principled and structured deep network architectures that can be trained to improve performance on specific tasks.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126575561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying compensation techniques on i-vectors extracted from short-test utterances for speaker verification using deep neural network","authors":"Il-Ho Yang, Hee-Soo Heo, Sung-Hyun Yoon, Ha-jin Yu","doi":"10.1109/ICASSP.2017.7953206","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953206","url":null,"abstract":"We propose a method to improve speaker verification performance when a test utterance is very short. In some situations with short test utterances, the performance of i-vector/probabilistic linear discriminant analysis systems degrades. The proposed method transforms short-utterance feature vectors into adequate vectors using a deep neural network, which compensates for the short utterances. To reduce the dimensionality of the search space, we extract several principal components from the residual vectors between every long-utterance i-vector in a development set and its truncated short-utterance i-vector. An input i-vector of the network is then transformed by a linear combination of these directions, with the network outputs corresponding to weights for the linear combination of principal components. We use public speech databases to evaluate the method. The experimental results on the short2-10sec condition (det6, male portion) of the NIST 2008 speaker recognition evaluation corpus show that the proposed method reduces the minimum detection cost relative to the baseline system, which uses linear discriminant analysis transformed i-vectors as features.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130029033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A constrained adaptive scan order approach to transform coefficient entropy coding","authors":"Ching-Han Chiang, Jingning Han, Yaowu Xu","doi":"10.1109/ICASSP.2017.7952366","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952366","url":null,"abstract":"Transform coefficient coding is a key module in modern video compression systems. Typically, a block of quantized coefficients is processed in a pre-defined zig-zag order, starting from DC and sweeping through low frequency positions to high frequency ones. Correlation between magnitudes of adjacent coefficients is exploited via context based probability models to improve compression efficiency. Such a scheme is premised on the assumption that spatial transforms compact energy towards lower frequency coefficients, and that a scan pattern following a descending order of the likelihood of coefficients being non-zero provides more accurate probability modeling. However, a pre-defined zig-zag pattern that is agnostic to signal statistics may not be optimal. This work proposes an adaptive approach that generates the scan pattern dynamically. Unlike prior attempts that directly sort a 2-D array of coefficient positions according to the appearance frequency of non-zero levels only, the proposed scheme employs a topological sort that also fully accounts for the spatial constraints due to the context dependency in entropy coding. A streamlined framework is designed for processing both intra and inter prediction residuals. This generic approach is experimentally shown to provide consistent coding performance gains across a wide range of test settings.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124003890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
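The pre-defined zig-zag baseline that the adaptive scheme improves on is straightforward to reproduce. A minimal sketch (function name and square-block assumption are illustrative, not from the paper):

```python
def zigzag_order(n):
    # Classic zig-zag scan for an n x n coefficient block: visit
    # anti-diagonals (constant i + j) from DC outward, alternating
    # direction on each diagonal so low frequencies come first.
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))
```

An adaptive variant would instead order positions by their observed probability of carrying a non-zero level, subject to the context-dependency constraints the paper enforces with a topological sort.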