2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Normalization of total variability matrix for i-vector/PLDA speaker verification
Wei Rao, M. Mak, Kong-Aik Lee
{"title":"Normalization of total variability matrix for i-vector/PLDA speaker verification","authors":"Wei Rao, M. Mak, Kong-Aik Lee","doi":"10.1109/ICASSP.2015.7178758","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178758","url":null,"abstract":"Gaussian PLDA with uncertainty propagation is effective for i-vector based speaker verification. The idea is to propagate the uncertainty of i-vectors caused by the duration variability of utterances to the PLDA model. However, a limitation of the method is the difficulty of performing length normalization on the posterior covariance matrix of an i-vector. This paper proposes a method to avoid performing length normalization on i-vectors in Gaussian PLDA modeling so that uncertainty propagation can be directly applied without transforming the posterior covariance matrices of i-vectors. Instead of performing length normalization on i-vectors independently, the proposed method normalizes the column vectors of the total variability matrix. Because the i-vectors of all utterances are derived from the same normalized total variability matrix, they will be subject to the same degree of normalization, thereby avoiding the undesirable distortion introduced by the utterance-dependent length-normalization process. Experimental results on both NIST 2010 and 2012 SREs demonstrate that the proposed method achieves a performance similar to (and in some situations better than) that of Gaussian PLDA with length normalization. The method has the potential of improving the performance of uncertainty propagation for i-vector/PLDA speaker verification.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114971306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
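The abstract pins the method on one concrete step: normalizing the columns of the total variability matrix T instead of length-normalizing each i-vector. As a rough illustration only, here is a minimal numpy sketch under the assumption that "normalization" means scaling each column of T to unit Euclidean norm; the abstract does not give the exact scaling rule, so the paper's formula may differ.

```python
import numpy as np

def normalize_total_variability(T):
    """Column-wise normalization of the total variability matrix T.

    Hypothetical reading of the abstract: each column of T is scaled
    to unit Euclidean norm, so every i-vector extracted with the
    normalized T receives the same degree of normalization.
    """
    norms = np.linalg.norm(T, axis=0, keepdims=True)
    return T / np.maximum(norms, 1e-12)  # guard against zero columns

# Toy usage: a random 60x10 total variability matrix
rng = np.random.default_rng(0)
T = rng.standard_normal((60, 10))
T_norm = normalize_total_variability(T)
print(np.linalg.norm(T_norm, axis=0))  # all ~1.0
```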
An Encryption-then-Compression system for JPEG 2000 standard
Osamu Watanabe, Akira Uchida, T. Fukuhara, H. Kiya
{"title":"An Encryption-then-Compression system for JPEG 2000 standard","authors":"Osamu Watanabe, Akira Uchida, T. Fukuhara, H. Kiya","doi":"10.1109/ICASSP.2015.7178165","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178165","url":null,"abstract":"A new Encryption-then-Compression (ETC) system for the JPEG 2000 standard is proposed in this paper. An ETC system is known as a system that makes image communication secure and efficient by using perceptual encryption and image compression. The proposed system uses the sign-scrambling and block-shuffling of discrete wavelet transform (DWT) coefficients as perceptual encryption. Unlike conventional ETC systems, the proposed system is compatible with the JPEG 2000 standard because the perceptually encrypted coefficients can be efficiently compressed by the JPEG 2000. The experimental results demonstrated that the proposed system achieved both acceptable compression performance and enough key-space for secure image communication while remaining compatible with the JPEG 2000 standard.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114974018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 50
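The two encryption primitives named in the abstract, sign-scrambling and block-shuffling of DWT coefficients, are simple to sketch. The following Python sketch (using the PyWavelets package) applies both to the subbands of a single-level 2-D DWT; the wavelet ('haar'), block size, and per-subband treatment are illustrative assumptions, not the authors' parameters. Decryption would regenerate the same key-seeded random signs and permutations and invert them.

```python
import numpy as np
import pywt  # PyWavelets

def scramble_subband(sb, rng, block=8):
    """Sign-scramble all coefficients, then shuffle square blocks.
    Block size and treating every subband alike are illustrative
    assumptions."""
    sb = sb * rng.choice([-1.0, 1.0], size=sb.shape)       # sign scrambling
    h = sb.shape[0] // block * block
    w = sb.shape[1] // block * block
    blocks = (sb[:h, :w].reshape(h // block, block, w // block, block)
                        .swapaxes(1, 2).reshape(-1, block, block))
    blocks = blocks[rng.permutation(len(blocks))]          # block shuffling
    sb[:h, :w] = (blocks.reshape(h // block, w // block, block, block)
                        .swapaxes(1, 2).reshape(h, w))
    return sb

def perceptual_encrypt(img, key):
    cA, (cH, cV, cD) = pywt.dwt2(img.astype(float), 'haar')
    rng = np.random.default_rng(key)                       # key-seeded stream
    return [scramble_subband(sb, rng) for sb in (cA, cH, cV, cD)]

# Toy usage on a random 64x64 "image"
subbands = perceptual_encrypt(np.random.default_rng(7).random((64, 64)), key=1234)
```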
Efficient spectrogram-based binary image feature for audio copy detection
Chahid Ouali, P. Dumouchel, Vishwa Gupta
{"title":"Efficient spectrogram-based binary image feature for audio copy detection","authors":"Chahid Ouali, P. Dumouchel, Vishwa Gupta","doi":"10.1109/ICASSP.2015.7178279","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178279","url":null,"abstract":"This paper presents the latest improvements on our Spectro system that detects transformed duplicate audio content. We propose a new binary image feature derived from a spectrogram matrix by using a threshold based on the average of the spectral values. We quantize this binary image by applying a tile of fixed size and computing the sum of each small square in the tile. Fingerprints of each binary image encode the positions of the selected tiles. Evaluation on TRECVID 2010 CBCD data shows that this new feature improves significantly the Spectro system for transformations that add irrelevant speech to the audio. Compared to a state-of-the-art audio fingerprinting system, the proposed method reduces the minimal Normalized Detection Cost Rate (min NDCR) by 33%, improves localization accuracy by 28% and results in 40% fewer missed queries.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115127581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
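The fingerprinting pipeline described in the abstract (binarize the spectrogram against its mean, tile it, keep the positions of the strongest tiles) can be sketched in a few lines of Python. The STFT parameters, tile size, and number of selected tiles below are illustrative guesses; the abstract does not specify them.

```python
import numpy as np
from scipy.signal import spectrogram

def binary_fingerprint(x, fs, tile=8, n_select=32):
    """Sketch of the abstract's pipeline: binarize the spectrogram
    against its mean, sum each fixed-size tile, and encode the
    positions of the strongest tiles. tile/n_select are assumptions."""
    _, _, S = spectrogram(x, fs=fs, nperseg=256, noverlap=128)
    B = (S > S.mean()).astype(np.uint8)            # binary image feature
    F = B.shape[0] // tile * tile
    T = B.shape[1] // tile * tile
    sums = (B[:F, :T]
            .reshape(F // tile, tile, T // tile, tile)
            .sum(axis=(1, 3)))                     # per-tile occupancy
    flat = np.argsort(sums.ravel())[::-1][:n_select]
    return np.unravel_index(flat, sums.shape)      # tile positions = fingerprint

# Toy usage on one second of noise
rng = np.random.default_rng(1)
rows, cols = binary_fingerprint(rng.standard_normal(16000), fs=16000)
```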
Paraphrastic recurrent neural network language models
Xunying Liu, Xie Chen, M. Gales, P. Woodland
{"title":"Paraphrastic recurrent neural network language models","authors":"Xunying Liu, Xie Chen, M. Gales, P. Woodland","doi":"10.1109/ICASSP.2015.7179004","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7179004","url":null,"abstract":"Recurrent neural network language models (RNNLM) have become an increasingly popular choice for state-of-the-art speech recognition systems. Linguistic factors in??uencing the realization of surface word sequences, for example, expressive richness, are only implicitly learned by RNNLMs. Observed sentences and their associated alternative paraphrases representing the same meaning are not explicitly related during training. In order to improve context coverage and generalization, paraphrastic RNNLMs are investigated in this paper. Multiple paraphrase variants were automatically generated and used in paraphrastic RNNLM training. Using a paraphrastic multi-level RNNLM modelling both word and phrase sequences, signi??cant error rate reductions of 0.6% absolute and perplexity reduction of 10% relative were obtained over the baseline RNNLM on a large vocabulary conversational telephone speech recognition system trained on 2000 hours of audio and 545 million words of texts. The overall improvement over the baseline n-gram LM was increased from 8.4% to 11.6% relative.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115618485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
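The core training trick, pooling each observed sentence with automatically generated paraphrases of it, does not depend on the LM family. The toy sketch below substitutes a count-based bigram LM for the paper's RNNLM, purely to stay short and self-contained, and uses hand-written paraphrases where the paper generates them with a statistical paraphrase model.

```python
from collections import Counter, defaultdict

def train_bigram_lm(sentences):
    """Toy count-based bigram LM with add-one smoothing.
    Stands in for the paper's RNNLM only to keep the sketch small."""
    counts, context, vocab = defaultdict(Counter), Counter(), set()
    for s in sentences:
        toks = ['<s>'] + s.split() + ['</s>']
        vocab.update(toks)
        for a, b in zip(toks, toks[1:]):
            counts[a][b] += 1
            context[a] += 1
    V = len(vocab)
    return lambda a, b: (counts[a][b] + 1) / (context[a] + V)

# Paraphrastic training: pool each observed sentence with its
# paraphrase variants (hypothetical examples, hand-written here).
observed    = ["please book a flight to boston"]
paraphrases = ["could you book a flight to boston",
               "please reserve a flight to boston"]
p = train_bigram_lm(observed + paraphrases)
print(p('please', 'reserve'))  # probability mass now covers paraphrase contexts
```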
Combining two phase codes to extend the radar unambiguous range and get a trade-off in terms of performance for any clutter
T. Rouffet, É. Grivel, P. Vallet, C. Enderli, S. Kemkemian
{"title":"Combining two phase codes to extend the radar unambiguous range and get a trade-off in terms of performance for any clutter","authors":"T. Rouffet, É. Grivel, P. Vallet, C. Enderli, S. Kemkemian","doi":"10.1109/ICASSP.2015.7178285","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178285","url":null,"abstract":"This paper deals with a phase-coded waveform which combines two binary phase codes, each impacting on specific properties of the radar receiving channel. After giving a detailed analysis of the expression of the received signal after processing when the Gaussian clutter is modeled by a pth-order autoregressive process, we focus our attention on the choice of the two phase codes: one aims at increasing the unambiguous range whereas the other is chosen by taking into account several criteria such as the detection performance. For this purpose, we suggest determining the Pareto fronts of 1st, 2nd and 3rd orders by means of an exhaustive search. Given the three Pareto fronts for different types of clutters simulated by making the AR parameters vary, we provide an automatic way to determine, before embedding it, the most robust phase codes using a fuzzy logic operator.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117118065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
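The selection machinery, an exhaustive search over binary phase codes followed by Pareto-front extraction, can be illustrated compactly. The sketch below scores each length-10 code on two stand-in objectives (peak and integrated autocorrelation sidelobe level); the paper's actual criteria involve detection performance under AR-modeled clutter, which is not reproduced here.

```python
import numpy as np
from itertools import product

def sidelobes(code):
    """Peak and integrated sidelobe level of the aperiodic
    autocorrelation of a +/-1 phase code."""
    c = np.asarray(code, float)
    ac = np.correlate(c, c, mode='full')
    sl = np.abs(np.delete(ac, len(c) - 1))  # drop the zero-lag peak
    return (sl.max(), sl.sum())

def pareto_front(points):
    """First-order Pareto front, both objectives minimized."""
    front = []
    for i, p in enumerate(points):
        dominated = any(all(q[k] <= p[k] for k in range(2)) and q != p
                        for j, q in enumerate(points) if j != i)
        if not dominated:
            front.append(p)
    return front

# Exhaustive search over all 2^10 binary codes of length 10
scores = [sidelobes(c) for c in product([-1, 1], repeat=10)]
print(pareto_front(scores)[:5])
```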
Single underwater image descattering and color correction
Huimin Lu, Yujie Li, S. Serikawa
{"title":"Single underwater image descattering and color correction","authors":"Huimin Lu, Yujie Li, S. Serikawa","doi":"10.1109/ICASSP.2015.7178245","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178245","url":null,"abstract":"Absorption, scattering, and color distortion are three major issues in underwater optical imaging. Light rays traveling through water are scattered and absorbed according to their wavelength. Scattering is caused by large suspended particles that degrade optical images captured underwater. Color distortion occurs because different wavelengths are attenuated to different degrees in water; consequently, images of ambient underwater environments are dominated by a bluish tone. In the present paper, we propose a novel underwater imaging model that compensates for the attenuation discrepancy along the propagation path. In addition, we develop a fast weighted guided normalized convolution domain filtering algorithm for enhancing underwater optical images in shallow oceans. The enhanced images are characterized by a reduced noised level, better exposure in dark regions, and improved global contrast, by which the finest details and edges are enhanced significantly.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117181531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
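As a very rough stand-in for the paper's attenuation-compensation step, the sketch below applies a gray-world-style per-channel gain, so the red channel (most attenuated underwater) is boosted relative to blue. The paper's actual imaging model and its weighted guided normalized convolution filter are not reproduced here.

```python
import numpy as np

def compensate_attenuation(img):
    """Crude wavelength-dependent compensation: rescale each channel
    toward a common mean. A simplified stand-in for the paper's
    attenuation model, used only to illustrate the idea."""
    img = img.astype(float)
    means = img.reshape(-1, 3).mean(axis=0)        # per-channel means (R, G, B)
    gain = means.mean() / np.maximum(means, 1e-6)  # weak channels get large gains
    return np.clip(img * gain, 0, 255).astype(np.uint8)

# Toy usage: a synthetic bluish frame
rng = np.random.default_rng(2)
frame = (rng.random((64, 64, 3)) * [80, 150, 220]).astype(np.uint8)
balanced = compensate_attenuation(frame)
```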
Improved linear least squares estimation using bounded data uncertainty
Tarig Ballal, T. Al-Naffouri
{"title":"Improved linear least squares estimation using bounded data uncertainty","authors":"Tarig Ballal, T. Al-Naffouri","doi":"10.1109/ICASSP.2015.7178607","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178607","url":null,"abstract":"This paper addresses the problemof linear least squares (LS) estimation of a vector x from linearly related observations. In spite of being unbiased, the original LS estimator suffers from high mean squared error, especially at low signal-to-noise ratios. The mean squared error (MSE) of the LS estimator can be improved by introducing some form of regularization based on certain constraints. We propose an improved LS (ILS) estimator that approximately minimizes the MSE, without imposing any constraints. To achieve this, we allow for perturbation in the measurement matrix. Then we utilize a bounded data uncertainty (BDU) framework to derive a simple iterative procedure to estimate the regularization parameter. Numerical results demonstrate that the proposed BDU-ILS estimator is superior to the original LS estimator, and it converges to the best linear estimator, the linear-minimum-mean-squared error estimator (LMMSE), when the elements of x are statistically white.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"186 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120864848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
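The abstract describes an iterative procedure for choosing the regularization parameter from a bound on the perturbation of the measurement matrix. One plausible fixed-point reading, consistent with the BDU literature but not necessarily the authors' exact update, is sketched below: alternate a ridge-type solve with re-estimating lambda from the residual-to-solution norm ratio scaled by the perturbation bound eta.

```python
import numpy as np

def bdu_ils(A, y, eta, iters=50):
    """Regularized LS with an iteratively estimated parameter.
    The fixed-point rule lam = eta * ||y - A x|| / ||x|| is an
    assumption; the paper's derived update may differ."""
    n = A.shape[1]
    AtA, Aty, lam = A.T @ A, A.T @ y, 0.0
    for _ in range(iters):
        x = np.linalg.solve(AtA + lam * np.eye(n), Aty)
        lam_new = eta * np.linalg.norm(y - A @ x) / max(np.linalg.norm(x), 1e-12)
        if abs(lam_new - lam) < 1e-9:
            break
        lam = lam_new
    return x, lam

# Toy usage at low SNR
rng = np.random.default_rng(3)
A = rng.standard_normal((30, 10))
x0 = rng.standard_normal(10)
y = A @ x0 + 2.0 * rng.standard_normal(30)
x_hat, lam = bdu_ils(A, y, eta=0.5)
```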
Learning the sparsity basis in low-rank plus sparse model for dynamic MRI reconstruction
A. Majumdar, R. Ward
{"title":"Learning the sparsity basis in low-rank plus sparse model for dynamic MRI reconstruction","authors":"A. Majumdar, R. Ward","doi":"10.1109/ICASSP.2015.7178075","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178075","url":null,"abstract":"Modeling a temporal image sequence as a super-position of sparse and low-rank component stems from studies in principal component pursuit (PCP). Recently this technique was applied for dynamic MRI reconstruction with two modifications. First, unlike the original PCP, the problem was to recover the image sequence from under-sampled measurements. Second, the sparse component of the signal was not sparse in itself but in a transform domain. Recent studies in dynamic MRI reconstruction showed that, instead of using a fixed sparsity basis, better recovery results can be achieved when the sparsifying dictionary is adaptively learned from the data using Blind Compressed Sensing (BCS) framework. In this work, we demonstrate that learning the sparsity basis using BCS like techniques improve the recovery accuracy from PCP when applied to dynamic MRI reconstruction problems.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121037832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
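The underlying low-rank-plus-sparse split is easy to demonstrate in the fully sampled, fixed-basis case. The sketch below alternates singular-value thresholding (low-rank part) with soft thresholding (sparse part); the paper's additional ingredients, recovery from undersampled k-space and a BCS-learned sparsity basis for the sparse component, are deliberately omitted, and the thresholds are toy values.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def soft(X, tau):
    """Soft thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0)

def low_rank_plus_sparse(M, tau_l=1.0, tau_s=0.1, iters=100):
    """Alternately split the (frames x pixels) Casorati matrix M into
    a low-rank L and a sparse S, PCP style."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(iters):
        L = svt(M - S, tau_l)
        S = soft(M - L, tau_s)
    return L, S

# Toy dynamic sequence: static background (low rank) + moving blob (sparse)
rng = np.random.default_rng(4)
bg = np.outer(np.ones(20), rng.random(100))   # 20 frames x 100 pixels
dyn = np.zeros((20, 100))
dyn[np.arange(20), np.arange(20) * 3] = 5.0
L, S = low_rank_plus_sparse(bg + dyn)
```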
Face hallucination via Cauchy regularized sparse representation
Shenming Qu, R. Hu, Shihong Chen, Zhongyuan Wang, Junjun Jiang, Cheng Yang
{"title":"Face hallucination via Cauchy regularized sparse representation","authors":"Shenming Qu, R. Hu, Shihong Chen, Zhongyuan Wang, Junjun Jiang, Cheng Yang","doi":"10.1109/ICASSP.2015.7178163","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178163","url":null,"abstract":"In dictionary-learning-based face hallucination, the testing image is represented as a linear combination of the training samples, and how to obtain the optimal coefficients is the primary issue. Sparse representation (SR) has ever been widely used in face hallucination, however, due to the fact that SR overemphasizes the sparsity, the obtained linear combination coefficients turn out far aggressively sparse, then leading to unsatisfactory hallucinated results. In this paper, we present a moderately sparse prior model for face hallucination problem with the L1 norm penalty in classic SR replaced by a Cauchy penalty term. An iterative optimization is further presented to solve the minimization of Cauchy regularized objective function. The experimental results on public face database demonstrate that our method is much more effective than state-of-the-art methods.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121250226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
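Replacing the L1 penalty with a Cauchy penalty, a sum of log(1 + (a_i/gamma)^2) terms, admits a standard iteratively reweighted least squares optimizer, since the penalty is majorized by a weighted quadratic. The sketch below uses that generic scheme; the paper derives its own iterative optimization, which may differ in detail, and lam/gamma here are arbitrary toy values.

```python
import numpy as np

def cauchy_sparse_code(D, y, lam=0.1, gamma=0.1, iters=50):
    """IRLS minimization of ||y - D a||^2 + lam * sum(log(1 + (a/gamma)^2)).
    A standard majorization scheme for the Cauchy penalty, used here
    as a generic stand-in for the paper's optimizer."""
    a = np.zeros(D.shape[1])
    DtD, Dty = D.T @ D, D.T @ y
    for _ in range(iters):
        w = 1.0 / (gamma**2 + a**2)              # reweighting from current estimate
        a = np.linalg.solve(DtD + lam * np.diag(w), Dty)
    return a

# Toy usage: code a test vector over 50 training atoms
rng = np.random.default_rng(5)
D = rng.standard_normal((30, 50))
y = D[:, 3] + 0.01 * rng.standard_normal(30)
a = cauchy_sparse_code(D, y)
print(np.argmax(np.abs(a)))                      # expect atom 3 to dominate
```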
Individualizing a monaural beamformer for cochlear implant users
Waldo Nogueira, Marta Lopez, Thilo Rode, S. Doclo, A. Büchner
{"title":"Individualizing a monaural beamformer for cochlear implant users","authors":"Waldo Nogueira, Marta Lopez, Thilo Rode, S. Doclo, A. Büchner","doi":"10.1109/ICASSP.2015.7179071","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7179071","url":null,"abstract":"Speech intelligibility in noisy environments is still quite limited for cochlear implant (CI) users. Classical beamformers such as the Generalized Sidelobe Canceller (GSC) can provide large improvements in speech intelligibility for CI users. These algorithms have been adopted from hearing aids and multimedia applications into the CI field. However, their optimization taking into consideration the peculiarities of electrical hearing with a CI has not yet been completely investigated. This paper presents a novel method to optimize the performance of a GSC for each individual CI user. We show through a combination of objective and novel subjective measures, how much distortion can be tolerated by a CI user without decreasing speech intelligibility. Experimental results with 5 CI users show that a GSC delivering just noticeable distortion is the one maximizing speech intelligibility for CI users.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127107386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
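For readers unfamiliar with the GSC structure being individualized, here is a minimal two-microphone sketch: a delay-and-sum fixed beamformer, a difference blocking matrix, and an NLMS adaptive noise canceller. A broadside target is assumed (no steering delays), and the paper's subject-specific distortion tuning is not modelled.

```python
import numpy as np

def gsc_two_mic(x1, x2, taps=16, mu=0.1):
    """Minimal two-microphone Generalized Sidelobe Canceller."""
    d = 0.5 * (x1 + x2)                  # fixed beamformer (delay-and-sum)
    u = x1 - x2                          # blocking matrix output (noise reference)
    w = np.zeros(taps)
    y = np.zeros_like(d)
    for n in range(taps, len(d)):
        u_vec = u[n - taps:n][::-1]
        y[n] = d[n] - w @ u_vec          # sidelobe-cancelled output
        w += mu * y[n] * u_vec / (u_vec @ u_vec + 1e-8)  # NLMS update
    return y

# Toy usage: a common target plus a directional interferer that leaks
# into the two microphones with different gains
rng = np.random.default_rng(6)
s = np.sin(2 * np.pi * 0.01 * np.arange(4000))
v = rng.standard_normal(4000)
out = gsc_two_mic(s + v, s - 0.5 * v)    # interferer is adaptively cancelled
```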