2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Normalized amplitude modulation features for large vocabulary noise-robust speech recognition
V. Mitra, H. Franco, M. Graciarena, Arindam Mandal
Abstract: Background noise and channel degradations seriously constrain the performance of state-of-the-art speech recognition systems. Studies comparing human speech recognition performance with automatic speech recognition systems indicate that the human auditory system is highly robust against background noise and channel variabilities compared to automated systems. A traditional way to add robustness to a speech recognition system is to construct a robust feature set for the speech recognition model. In this work, we present an amplitude modulation feature derived from Teager's nonlinear energy operator that is power normalized and cosine transformed to produce normalized modulation cepstral coefficient (NMCC) features. The proposed NMCC features are compared against state-of-the-art noise-robust features on Aurora-2 and a renoised Wall Street Journal (WSJ) corpus. The WSJ word-recognition experiments were performed on both a clean and an artificially renoised WSJ corpus using SRI's DECIPHER large vocabulary speech recognition system. The experiments were performed under three train-test conditions: (a) matched, (b) mismatched, and (c) multi-conditioned. The Aurora-2 digit recognition task was performed using the standard HTK recognizer distributed with Aurora-2. Our results indicate that the proposed NMCC features demonstrated noise robustness in almost all the training-test conditions of the renoised WSJ data and also improved digit recognition accuracies on Aurora-2 compared to MFCCs and state-of-the-art noise-robust features.
DOI: 10.1109/ICASSP.2012.6288824
Citations: 104
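The NMCC front end described in the abstract chains a Teager-style energy operator, power-law normalization, and a cosine transform. As an illustration only (the published pipeline also involves a gammatone filterbank and AM-signal estimation, which are omitted here; all function names and parameter values below are hypothetical), the two core steps can be sketched as:

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    x = np.asarray(x, dtype=float)
    e = np.empty_like(x)
    e[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    e[0], e[-1] = e[1], e[-2]  # replicate at the boundaries
    return e

def modulation_cepstra(frames, num_ceps=13, power=0.1):
    """Power-law compress per-sample Teager energies of each frame and
    project onto a DCT-II basis to obtain cepstrum-like coefficients."""
    frames = np.atleast_2d(frames)
    n = frames.shape[1]
    k = np.arange(num_ceps)[:, None]
    basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)  # DCT-II rows
    feats = []
    for frame in frames:
        e = np.maximum(teager_energy(frame), 1e-12)  # floor before compression
        feats.append(basis @ e ** power)
    return np.array(feats)
```

A constant signal has zero Teager energy, which makes the operator sensitive to amplitude and frequency modulations rather than raw level.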
A family of Bounded Component Analysis algorithms
A. Erdogan
Abstract: Bounded Component Analysis (BCA) has recently been introduced as an alternative method for the Blind Source Separation problem. Under the generic assumption of source boundedness, BCA provides a flexible framework for the separation of dependent (even correlated) as well as independent sources. This article provides a family of algorithms derived from the geometric picture implied by the founding assumptions of the BCA approach. We also provide a numerical example demonstrating the ability of the proposed algorithms to separate mixtures of some dependent sources.
DOI: 10.1109/ICASSP.2012.6288270
Citations: 19
Automatic generation of synthesizable hardware implementation from high level RVC-cal description
Khaled Jerbi, M. Raulet, O. Déforges, M. Abid
Abstract: Data processing algorithms are increasing in complexity, especially for image and video coding. Hardware development directly in hardware description languages (HDLs) such as VHDL or Verilog is therefore a difficult task. Current research in this context introduces new methodologies to automate the generation of such descriptions. In our work we adopted a high-level, target-independent language called CAL (Caltrop Actor Language). This language is associated with a set of tools to easily design dataflow applications, as well as a hardware compiler to automatically generate the implementation. Before the modifications presented in this paper, the existing CAL hardware back-end did not support some high-level features of the CAL language; consequently, high-level designed actors had to be manually transformed to be synthesizable. In this paper, we introduce a general automatic transformation of CAL descriptions that makes these structures compliant and synthesizable. The transformation analyzes the CAL code, detects the targeted features, and makes the required changes to obtain synthesizable code while preserving the application's behavior. This work resolves the main bottleneck of the hardware generation flow from CAL designs.
DOI: 10.1109/ICASSP.2012.6288199
Citations: 13
Graph spectral compressed sensing for sensor networks
Xiaofan Zhu, M. Rabbat
Abstract: Consider a wireless sensor network with N sensor nodes measuring data which are correlated temporally or spatially. We consider the problem of reconstructing the original data by transmitting only M ≪ N sensor readings while guaranteeing that the reconstruction error is small. Assuming the original signal is "smooth" with respect to the network topology, our approach is to gather measurements from a random subset of nodes and then interpolate with respect to the graph Laplacian eigenbasis, leveraging ideas from compressed sensing. We propose algorithms for both temporally and spatially correlated signals, and the performance of these algorithms is verified using both synthesized data and real-world data. Significant savings are made in terms of energy resources, bandwidth, and query latency.
DOI: 10.1109/ICASSP.2012.6288515
Citations: 56
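The reconstruction idea in this abstract (sample M ≪ N nodes, then interpolate in the graph Laplacian eigenbasis) can be illustrated with a toy sketch on a path graph. The graph, sizes, and function names below are hypothetical, not taken from the paper:

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian L = D - A of a path graph on n nodes."""
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(A.sum(axis=1)) - A

def reconstruct(samples, sample_idx, n, k):
    """Least-squares fit of the sampled values in the span of the k
    smoothest Laplacian eigenvectors, then interpolate at every node."""
    _, U = np.linalg.eigh(path_laplacian(n))  # eigenvalues in ascending order
    B = U[:, :k]                              # k smoothest basis vectors
    coeffs, *_ = np.linalg.lstsq(B[sample_idx], samples, rcond=None)
    return B @ coeffs

rng = np.random.default_rng(0)
n, k, m = 64, 5, 20
_, U = np.linalg.eigh(path_laplacian(n))
x = U[:, :k] @ rng.standard_normal(k)        # exactly k-bandlimited signal
idx = rng.choice(n, size=m, replace=False)   # readings from m random nodes
x_hat = reconstruct(x[idx], idx, n, k)       # recover all n values
```

For a signal that truly lies in the span of the k smoothest eigenvectors, m > k random samples recover it essentially exactly; real sensor data are only approximately bandlimited, so the error is small rather than zero.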
Noncoherent misbehavior detection in space-time coded cooperative networks
Li-Chung Lo, Zhao-Jie Wang, Wan-Jen Huang
Abstract: Consider a two-relay decode-and-forward (DF) cooperative network in which Alamouti coding is adopted among the relays to exploit spatial diversity. The spatial diversity gain is diminished, however, in the presence of misbehaving relays. Most existing work on detecting malicious relays requires knowledge of instantaneous channel status, which is usually unavailable if the relays deliberately garble retransmitted signals. To this end, we propose a noncoherent misbehavior detector that uses the second-order statistics of channel estimates for the relay-destination links. Simulation results show that increasing the number of received blocks provides significant improvement even in the low-SNR regime.
DOI: 10.1109/ICASSP.2012.6288561
Citations: 11
Detecting passive eavesdroppers in the MIMO wiretap channel
A. Mukherjee, A. L. Swindlehurst
Abstract: The MIMO wiretap channel comprises a passive eavesdropper that attempts to intercept communications between an authorized transmitter-receiver pair, with each node equipped with multiple antennas. In a dynamic network, it is imperative that the presence of a passive eavesdropper be determined before the transmitter can deploy robust secrecy-encoding schemes as a countermeasure. This is a difficult task in general, since by definition the eavesdropper is passive and never transmits. In this work we adopt a method that allows the legitimate nodes to detect the passive eavesdropper from the local oscillator power that is inadvertently leaked from its RF front end. We examine the performance of noncoherent energy detection as well as optimal coherent detection schemes. We then show how the proposed detectors allow the legitimate nodes to increase the MIMO secrecy rate of the channel.
DOI: 10.1109/ICASSP.2012.6288501
Citations: 147
MLLR transforms of self-organized units as features in speaker recognition
M. Siu, Omer Lang, H. Gish, S. Lowe, Arthur Chan, O. Kimball
Abstract: Using speaker adaptation parameters, such as maximum likelihood linear regression (MLLR) adaptation matrices, as features for speaker recognition (SR) has been shown to perform well and can also provide complementary information for fusion with other acoustic-based SR systems, such as GMM-based systems. Estimating the adaptation parameters requires a speech recognizer in the SR domain, which in turn requires transcribed training data for recognizer training. This limits the approach to domains where training transcriptions are available. To generalize the adaptation-parameter approach to domains without transcriptions, we propose the use of self-organized unit (SOU) recognizers that can be trained without supervision (i.e., without transcribed data). We report results on the 2002 NIST speaker recognition evaluation (SRE2002) extended data set and show that MLLR parameters estimated from SOU recognizers give performance comparable to systems using a matched recognizer. SOU recognizers also outperform cross-lingual recognizers. When we fuse the SOU and word recognizers, the SR equal error rate (EER) is reduced by a further 15%. This suggests that SOU recognizers can be useful whether or not transcribed data for recognizer training are available.
DOI: 10.1109/ICASSP.2012.6288891
Citations: 2
Minimax design of sparse FIR digital filters
A. Jiang, H. Kwan, Yanping Zhu, Xiaofeng Liu
Abstract: In this paper, we present a novel algorithm to design sparse FIR digital filters in the minimax sense. To tackle the nonconvexity of the design problem, an efficient iterative procedure is developed to find a potential sparsity pattern. In each iteration, a subproblem in a simpler form is constructed. Instead of directly solving these nonconvex subproblems, we resort to their respective dual problems. It can be proved that, under a weak condition, globally optimal solutions of these subproblems can be attained by solving their dual problems. In this case, the overall iterative procedure converges to a locally optimal solution of the original design problem. The true minimax design can then be achieved by refining the FIR filter obtained by the iterative procedure. The design procedure can be repeated several times to further improve the sparsity of the results, with the output of the previous stage used as the initial point of the subsequent design. Simulation results demonstrate the effectiveness of the proposed algorithm.
DOI: 10.1109/ICASSP.2012.6288670
Citations: 3
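The paper's dual-based minimax procedure is not reproduced here. As a much cruder illustration of the underlying trade-off (approximation quality versus the number of nonzero taps), one can fit a linear-phase lowpass filter by least squares on a frequency grid and then hard-threshold the smallest taps; every name and parameter below is hypothetical:

```python
import numpy as np

def sparse_lowpass_fir(num_taps=31, cutoff=0.3, keep=15):
    """Least-squares lowpass fit on a dense frequency grid, then
    hard-threshold the smallest-magnitude taps to impose sparsity.
    A simple stand-in for the paper's dual-based minimax algorithm."""
    grid = np.linspace(0.0, np.pi, 512)
    desired = (grid <= cutoff * np.pi).astype(float)   # ideal lowpass response
    n = np.arange(num_taps)
    # zero-phase amplitude basis for a symmetric (linear-phase) filter
    A = np.cos(np.outer(grid, n - (num_taps - 1) / 2))
    h, *_ = np.linalg.lstsq(A, desired, rcond=None)    # min-norm LS solution
    # zero out all but the `keep` largest-magnitude taps
    h[np.argsort(np.abs(h))[:-keep]] = 0.0
    return h
```

Unlike this sketch, the paper's algorithm controls the *maximum* ripple while searching for the sparsity pattern, which is what makes the design minimax rather than least squares.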
Connexions and the SPEN fellows program
T. Welch, M. Morrow, C. Wright
Abstract: Texas Instruments (TI) has created the Signal Processing Education Network (SPEN) Fellows program to help identify and fill content gaps within the signal processing content library hosted on the Connexions ecosystem. This paper gives an overview of Connexions, SPEN, and the SPEN Fellows program, and reviews this year's SPEN Fellows project on creating Connexions content for real-time DSP (RT-DSP).
DOI: 10.1109/ICASSP.2012.6288495
Citations: 3
Shift-variant non-negative matrix deconvolution for music transcription
Holger Kirchhoff, S. Dixon, Anssi Klapuri
Abstract: In this paper, we address the task of semi-automatic music transcription, in which the user provides prior information about the polyphonic mixture under analysis. We propose a non-negative matrix deconvolution framework for this task that allows instruments to be represented by a different basis function for each fundamental frequency ("shift variance"). Two different types of user input are studied: information about the types of instruments, which enables the use of basis functions from an instrument database, and a manual transcription of a number of notes, which enables template estimation from the data under analysis itself. Experiments are performed on a data set of mixtures of acoustical instruments up to a polyphony of five. The results confirm a significant loss in accuracy when database templates are used, and show the superiority of the Kullback-Leibler divergence over the least-squares error cost function.
DOI: 10.1109/ICASSP.2012.6287833
Citations: 23
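The comparison between the Kullback-Leibler divergence and the least-squares cost refers to NMF-style factorizations of a spectrogram. A minimal sketch of the standard KL multiplicative update rules (the classic Lee-Seung form, not the paper's shift-variant deconvolution model) is:

```python
import numpy as np

def nmf_kl(V, rank, iters=200, seed=0):
    """Factor a non-negative matrix V ~ W @ H by multiplicative updates
    that minimize the generalized KL divergence D(V || WH)."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + 0.1   # non-negative random init
    H = rng.random((rank, T)) + 0.1
    eps = 1e-12                       # avoid division by zero
    ones = np.ones_like(V)
    for _ in range(iters):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)
    return W, H
```

In a transcription setting, the columns of W play the role of spectral templates (per instrument and pitch) and the rows of H their activations over time; the shift-variant model in the paper generalizes this by giving each fundamental frequency its own basis function.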