{"title":"ENF analysis on recaptured audio recordings","authors":"Hui Su, Ravi Garg, Adi Hajj-Ahmad, Min Wu","doi":"10.1109/ICASSP.2013.6638212","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638212","url":null,"abstract":"Electric Network Frequency (ENF) based forensic analysis is a promising tool for timestamp authentication and forgery detection in multimedia recordings such as audio and video. The ENF signal is embedded in an audio recording due to electromagnetic interference from the power lines. The time of creation of a multimedia recording can be determined by comparing the ENF signal embedded in the recording with a reference ENF database collected from the power grid. In this paper, we conduct a study of the effect of recapturing audio recordings on the ENF embedding. We demonstrate that recaptured audio recordings pick up two ENF signals: the content ENF signal, which is inherited from the original audio recording, and the recapturing ENF signal, which is embedded during the recapturing process. Conventional ENF signal extraction techniques may fail on such recordings when the two ENF signals are at the same nominal value. A decorrelation algorithm is proposed to extract the content ENF signal and the recapturing ENF signal. The experimental results show the effectiveness of the proposed method in estimating both ENF signals.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129795502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A diffusion-based distributed EM algorithm for density estimation in wireless sensor networks","authors":"S. S. Pereira, A. Pagès-Zamora, Roberto López-Valcarce","doi":"10.1109/ICASSP.2013.6638501","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638501","url":null,"abstract":"Distributed implementations of the Expectation-Maximization (EM) algorithm reported in the literature have been proposed to solve specific problems. In general, a primary requirement to derive a distributed solution is that the structure of the centralized version enables the computation involving global information in a distributed fashion. This paper treats the problem of distributed estimation of Gaussian densities by means of the EM algorithm in wireless sensor networks using diffusion strategies, where the information is gradually diffused across the network for the computation of the global functions. The low-complexity implementation presented here is based on a two-time-scale operation for information averaging and diffusion. The convergence to a fixed point of the centralized solution has been studied, and the appealing results motivate our choice of this model. Numerical examples show that the performance of the distributed EM is, in practice, equal to that of the centralized scheme.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128204402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust semi-definite relaxation MIMO detection in a non-Gaussian channel","authors":"Jakob Vovnoboy, A. Wiesel, Wing-Kin Ma","doi":"10.1109/ICASSP.2013.6638714","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638714","url":null,"abstract":"Semi-definite relaxation (SDR) is a popular technique for Multi-Input Multi-Output (MIMO) detection. For Binary Phase-Shift Keying (BPSK) and Quadrature Phase-Shift Keying (QPSK), it has been found that SDR can provide near-optimal Bit Error Rate (BER) performance in a Gaussian channel. However, if the noise in the channel deviates from the Gaussian model, as it does in many real wireless channels, BER performance drops considerably. In this paper we show that SDR can be applied for detection in a non-Gaussian channel using Huber's M-estimation method for robust regression.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"16 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128437508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel endmember, fractional abundance, and contrast model for hyperspectral imagery","authors":"S. Douglas","doi":"10.1109/ICASSP.2013.6638037","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638037","url":null,"abstract":"In multispectral and hyperspectral image analysis for remote sensing, variations in contrast due to cloud shadows and topography can cause problems in the demixing process, creating false endmembers and erroneous fractional abundance images. This paper introduces a novel hyperspectral mixing model in which pixel contrast is accounted for explicitly in the image formation. A method is described for estimating the per-pixel contrast for any chosen endmember-based demixing algorithm. Applications of the method to both synthetic and real-world satellite imagery illustrate its efficacy.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128461321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intra prediction mode coding for scalable HEVC","authors":"Döne Bugdayci Sansli, M. O. Bici, K. Ugur, M. Gabbouj","doi":"10.1109/ICASSP.2013.6637876","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6637876","url":null,"abstract":"The High Efficiency Video Coding (HEVC) standard introduced an increased number of intra prediction directions in order to improve intra prediction performance by efficiently modeling the directional structures found in typical video content. Efficient coding of intra prediction mode information is realized through a Most Probable Mode (MPM) list approach. In a scalable system, due to the high correlation between the layers, utilization of the base layer intra prediction mode can improve coding performance. In this paper, we propose a new intra prediction mode coding algorithm for the scalable extension of HEVC where only the difference between the intra prediction modes of the base and enhancement layers is coded. We provide experimental results and also a comparison of the proposed algorithm with an MPM list based approach where the base layer intra prediction mode is added to the list as the most probable mode. Experimental results show BD-rate gains of up to 1.1% in 2x spatial scalability and 0.7% in 1.5x scalability for the all-intra configuration.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128520788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient iterative mean shift based cosine dissimilarity for multi-recording speaker clustering","authors":"Mohammed Senoussaoui, P. Kenny, P. Dumouchel, Themos Stafylakis","doi":"10.1109/ICASSP.2013.6639164","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6639164","url":null,"abstract":"Speaker clustering is an important task in many applications such as Speaker Diarization as well as Speech Recognition. Speaker clustering can be done within a single multi-speaker recording (Diarization) or for a set of different recordings. In this work we are interested in the former case and we propose a simple iterative Mean Shift (MS) algorithm to deal with this problem. Traditionally, the MS algorithm is based on Euclidean distance. We propose to use the Cosine distance in order to build a new version of the MS algorithm. We report results as measured by speaker and cluster impurities on NIST SRE 2008 datasets.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128534136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning discriminative basis coefficients for eigenspace MLLR unsupervised adaptation","authors":"Yajie Miao, Florian Metze, A. Waibel","doi":"10.1109/ICASSP.2013.6639208","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6639208","url":null,"abstract":"Eigenspace MLLR is effective for fast adaptation when the amount of adaptation data is limited, e.g., less than 5 s. The general motivation is to represent the MLLR transform as a linear combination of basis matrices. In this paper, we present a framework to estimate a speaker-independent discriminative transform over the combination coefficients. This discriminative basis coefficients transform (DBCT) is learned by optimizing discriminative criteria over all the training speakers. During recognition, the ML basis coefficients for each testing speaker are first found, and DBCT is then applied to them to give the final MLLR transform discrimination ability. Experiments show that DBCT results in consistent WER reduction in unsupervised adaptation, compared with both standard ML and discriminatively trained transforms.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128756544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing with deconvolution the metrological performance of the grid method for in-plane strain measurement","authors":"F. Sur, M. Grédiac","doi":"10.1109/ICASSP.2013.6637914","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6637914","url":null,"abstract":"This article is motivated by a problem from experimental solid mechanics. The grid method makes it possible to estimate in-plane displacement and strain components in a deformed material. A regular grid is deposited on the surface of the material, and images are taken before and after deformation. Windowed Fourier analysis then gives an estimate of the surface displacement and strain components. We show that the estimates obtained by this technique are approximately the convolution of the actual values with the analysis window. We also characterize how the noise in the grid image impairs the displacement and strain maps. Finally, the metrological performance of the grid method is enhanced with deconvolution algorithms. This work is potentially of interest in optical interferometry, since grids are particular fringe patterns.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128806167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting asset value dislocations in multi-agent models of market microstructure","authors":"V. Krishnamurthy, Anup Aryan","doi":"10.1109/ICASSP.2013.6639372","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6639372","url":null,"abstract":"Consider a financial market participant observing the trade flow of an asset traded through a limit order book. Trades are driven by an agent-based model where individual agents observe the trading decisions of previous agents, as well as their private signal on the value of the asset and then execute a trading decision. Given trading decisions of agents, how can a market observer detect a shock to the underlying value of the traded asset? The distribution of shock times is assumed to be phase-type distributed to allow for a general set of change time probabilities beyond geometric change times. We show that this problem is equivalent to change detection with social learning. We provide structural results that allow the optimal detection policy to be characterized by a single threshold policy.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128616910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of an orthonormal complex multiresolution analysis","authors":"Liying Wei, T. Blu","doi":"10.1109/ICASSP.2013.6638081","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638081","url":null,"abstract":"We design two complex filters {h[n], g[n]} for an orthogonal filter bank structure based on two atom functions {ρ<sub>0</sub><sup>α</sup>(t), ρ<sub>1/2</sub><sup>α</sup>(t)}, such that: 1) they generate an orthonormal multiwavelet basis; 2) the two complex conjugate wavelets are Hilbert wavelets, i.e., their frequency responses are supported either on positive or negative frequencies; and 3) the two scaling functions are real. The developed complex wavelet transform (CWT) is non-redundant, nearly shift-invariant, and distinguishable for diagonal features. The distinguishability in diagonal features is demonstrated by comparison with the real discrete wavelet transform.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124665605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}