2013 IEEE International Conference on Acoustics, Speech and Signal Processing: Latest Publications

Multi-task learning in deep neural networks for improved phoneme recognition
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6639012
M. Seltzer, J. Droppo
Abstract: In this paper we demonstrate how to improve the performance of deep neural network (DNN) acoustic models using multi-task learning. In multi-task learning, the network is trained to perform both the primary classification task and one or more secondary tasks using a shared representation. The additional model parameters associated with the secondary tasks represent a very small increase in the number of trained parameters, and can be discarded at runtime. In this paper, we explore three natural choices for the secondary task: the phone label, the phone context, and the state context. We demonstrate that, even on a strong baseline, multi-task learning can provide a significant decrease in error rate. Using phone context, the phonetic error rate (PER) on TIMIT is reduced from 21.63% to 20.25% on the core test set, surpassing the best performance in the literature for a DNN that uses a standard feed-forward network architecture.
Citations: 244
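As an illustration of the shared-representation idea described in the abstract, here is a minimal sketch in PyTorch (a framework the paper does not use) of a network with a primary state-classification head and a secondary phone-label head; the layer sizes, sigmoid activations, and the task weight alpha are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskDNN(nn.Module):
    """Shared hidden layers feeding two softmax heads: the primary head
    predicts HMM states; the secondary head predicts phone labels and is
    dropped after training."""
    def __init__(self, n_in=440, n_hidden=2048, n_states=1936, n_phones=61):
        super().__init__()
        # Illustrative sizes only.
        self.shared = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.Sigmoid(),
            nn.Linear(n_hidden, n_hidden), nn.Sigmoid(),
        )
        self.primary = nn.Linear(n_hidden, n_states)    # kept at runtime
        self.secondary = nn.Linear(n_hidden, n_phones)  # discarded at runtime

    def forward(self, x):
        h = self.shared(x)
        return self.primary(h), self.secondary(h)

def multitask_loss(state_logits, phone_logits, y_state, y_phone, alpha=0.3):
    # Weighted sum of primary and secondary cross-entropy losses.
    return (F.cross_entropy(state_logits, y_state)
            + alpha * F.cross_entropy(phone_logits, y_phone))
```

At decode time only the primary head needs to be evaluated, which is why the secondary parameters can be discarded without any runtime cost, as the abstract notes.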
Large-scale malware classification using random projections and neural networks
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6638293
George E. Dahl, J. W. Stokes, L. Deng, Dong Yu
Abstract: Automatically generated malware is a significant problem for computer users. Analysts are able to manually investigate a small number of unknown files, but the best large-scale defense for detecting malware is automated malware classification. Malware classifiers often use sparse binary features, and the number of potential features can be on the order of tens or hundreds of millions. Feature selection reduces the number of features to a manageable number for training simpler algorithms such as logistic regression, but this number is still too large for more complex algorithms such as neural networks. To overcome this problem, we use random projections to further reduce the dimensionality of the original input space. Using this architecture, we train several very large-scale neural network systems with over 2.6 million labeled samples, thereby achieving classification results with a two-class error rate of 0.49% for a single neural network and 0.42% for an ensemble of neural networks.
Citations: 413
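To make the dimensionality-reduction step concrete, the following sketch projects sparse binary features of very high dimension down to a dense input that a neural network can handle. It uses scikit-learn's SparseRandomProjection with toy dimensions; the real system is far larger, and the paper's exact projection construction is not reproduced here.

```python
import numpy as np
from scipy import sparse
from sklearn.random_projection import SparseRandomProjection

# Toy stand-in for sparse binary malware features:
# 10,000 samples, 1,000,000 potential features, about 50 active per sample.
rng = np.random.default_rng(0)
rows = np.repeat(np.arange(10_000), 50)
cols = rng.integers(0, 1_000_000, size=rows.size)
X = sparse.csr_matrix(
    (np.ones(rows.size), (rows, cols)), shape=(10_000, 1_000_000))

# Random projection: reduce the million-dimensional sparse input to a few
# thousand dimensions that a neural network can consume directly.
projector = SparseRandomProjection(n_components=4000, random_state=0)
X_low = projector.fit_transform(X)   # sparse, shape (10000, 4000)
X_dense = X_low.toarray()            # dense training input for the network
print(X_dense.shape)
```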
Improved estimation of EEG evoked potentials by jitter compensation and enhancing spatial filters
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6637845
A. Souloumiac, B. Rivet
Abstract: We propose in this paper a new technique to investigate Event-Related Potentials, or Evoked-Response Potentials, in the electroencephalographic signal. The multidimensional electroencephalographic signal is first spatially filtered to enhance the Evoked-Response Potentials using the xDAWN algorithm, and second, the single-trial latencies (whatever their origin: physiological or electronic) are estimated by maximizing a cross-correlation without any a priori model. The performance of this approach is illustrated on two classical P300-Speller electroencephalographic databases (BCI Competition II and III). The single-trial distribution of the P300 Evoked-Response Potential is deblurred using the proposed resynchronization algorithm, with applications in particular to Brain-Computer Interfaces.
Citations: 6
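The resynchronization idea, estimating a per-trial latency by maximizing a cross-correlation against a reference and then realigning before averaging, can be sketched as follows. This covers only the latency-compensation step under simplifying assumptions (integer sample shifts, circular shifting via np.roll, a single-channel signal); it is not the xDAWN spatial filtering itself.

```python
import numpy as np

def estimate_latencies(trials, template, max_shift=25):
    """For each single-trial response, pick the integer shift (in samples)
    that maximizes its correlation with a reference template, e.g. the
    current average evoked response."""
    latencies = []
    for x in trials:                            # trials: (n_trials, n_samples)
        shifts = list(range(-max_shift, max_shift + 1))
        scores = [np.dot(np.roll(x, -s), template) for s in shifts]
        latencies.append(shifts[int(np.argmax(scores))])
    return np.asarray(latencies)

def realign_and_average(trials, latencies):
    """Undo the estimated jitter and average, 'deblurring' the ERP estimate."""
    return np.mean([np.roll(x, -s) for x, s in zip(trials, latencies)], axis=0)

# Typical use: alternate a few times between re-estimating the template from
# the realigned average and re-estimating the per-trial latencies.
```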
Cooperative spectrum sharing with joint receiver decoding
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6638674
Songze Li, U. Mitra, A. Pandharipande
Abstract: We consider a spectrum sharing protocol wherein the primary and secondary transmitters cooperatively relay each other's messages. Transmission is done in two phases, with each transmitter attempting to decode the message from the other system's transmission in the first phase. The second-phase transmission consists of the decoded message superposed onto its own message. Priority is given to the primary system transmissions by having the primary message always transmitted over the two phases, while the secondary message is transmitted depending on successful decoding. We consider the scenario where the primary and secondary receivers are co-located, forming a virtual two-antenna receiver. We assess the performance of the system in terms of outage probability and characterize the performance corresponding to each state of the Markov chain that governs the proposed transmission protocol. We show that joint decoding offers a 20 dB performance improvement over separate decoding for the primary user and 1.8 dB for the secondary user.
Citations: 1
Backwards-compatible error propagation recovery for the AMR codec over erasure channels
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6639258
A. Gómez, J. L. Pérez-Córdoba, B. Geiser
Abstract: This paper presents a recovery scheme for the error-propagation distortion that frequently appears after a frame erasure in CELP-based speech coders, in particular the AMR codec. The extensive use of predictive filters and parameter encoding allows high-quality speech synthesis in these codecs, but makes them more vulnerable to frame erasures. Thus, when a frame is lost, an additional distortion appears in the subsequent frame even though it was correctly received, further degrading the speech quality. This degradation can also propagate over several frames, becoming even more damaging than the loss itself. This well-known fact has motivated the development of techniques that prevent or mitigate the error propagation. Nevertheless, previously proposed methods modify the transmission scheme in some respect (by including additional frames, FEC codes, etc.), making them incompatible with the original decoder. In this work, we apply a steganographic technique to embed recovery data that assist the decoder after a frame loss. These data mainly consist of resynchronization pulses and correction vectors for the excitation signal and the spectral envelope, respectively. PESQ results confirm that our proposal achieves higher robustness against error propagation while retaining full backwards compatibility with the AMR standard.
Citations: 2
Recent advances in deep learning for speech research at Microsoft
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6639345
L. Deng, Jinyu Li, J. Huang, K. Yao, Dong Yu, F. Seide, M. Seltzer, G. Zweig, Xiaodong He, J. Williams, Y. Gong, A. Acero
Abstract: Deep learning is becoming a mainstream technology for speech recognition at industrial scale. In this paper, we provide an overview of the work by Microsoft speech researchers since 2009 in this area, focusing on more recent advances which shed light on the basic capabilities and limitations of the current deep learning technology. We organize this overview along the feature-domain and model-domain dimensions according to the conventional approach to analyzing speech systems. Selected experimental results, including speech recognition and related applications such as spoken dialogue and language modeling, are presented to demonstrate and analyze the strengths and weaknesses of the techniques described in the paper. Potential improvements of these techniques and future research directions are discussed.
Citations: 770
“Wow!” Bayesian surprise for salient acoustic event detection
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6638898
Boris Schauerte, R. Stiefelhagen
Abstract: We extend our previous work and present how Bayesian surprise can be applied to detect salient acoustic events. To this end, we use the Gamma distribution to model each frequency's spectrogram distribution. Then, we use the Kullback-Leibler divergence between the posterior and prior distributions to calculate how "unexpected", and thus surprising, newly observed audio samples are. This way, we are able to efficiently detect arbitrary, unexpected and thus surprising acoustic events. Complementing our qualitative system evaluations for (humanoid) robots, we demonstrate the effectiveness and practical applicability of the approach on the CLEAR 2007 acoustic event detection data.
Citations: 22
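A minimal sketch of the surprise computation described in the abstract: a Gamma distribution per frequency bin, updated as new spectrogram samples arrive, with surprise measured as the KL divergence from posterior to prior. The conjugate exponential-observation update below is an illustrative assumption; the paper's exact observation model and update rule are not reproduced here.

```python
import numpy as np
from scipy.special import gammaln, psi

def kl_gamma(a_p, b_p, a_q, b_q):
    """KL( Gamma(shape a_p, rate b_p) || Gamma(shape a_q, rate b_q) )."""
    return (a_p * np.log(b_p) - a_q * np.log(b_q)
            - gammaln(a_p) + gammaln(a_q)
            + (a_p - a_q) * (psi(a_p) - np.log(b_p))
            + (b_q - b_p) * (a_p / b_p))

def surprise_update(a, b, x):
    """One frequency bin: Gamma(a, b) prior over the rate of the (assumed
    exponentially distributed) spectral power; observing x yields the
    conjugate posterior Gamma(a + 1, b + x). Surprise = KL(posterior || prior)."""
    a_post, b_post = a + 1.0, b + x
    return kl_gamma(a_post, b_post, a, b), a_post, b_post

# Per-frame surprise for a spectrogram column could then be summed over
# frequency bins and thresholded to flag salient acoustic events.
```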
Joint source localization and sensor position refinement for sensor networks
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6638415
Ming Sun, Zhenhua Ma, K. C. Ho
Abstract: Modern localization systems/platforms such as sensor networks often experience uncertainty in the sensor positions. Improving the sensor positions is necessary in order to achieve better localization performance. This paper proposes a joint estimator for locating multiple unknown sources and refining the sensor positions using TOA measurements. Rather than resorting to the traditional iterative nonlinear least-squares approach that requires careful initialization, the proposed estimator is algebraic and computationally attractive. The small-noise analysis shows that the proposed estimator is able to attain the CRLB performance for both the unknown sources and the sensor positions. Simulations support the efficiency of the proposed estimator.
Citations: 2
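For context on the TOA measurement model, here is the basic single-source building block with known sensor positions: a linearized least-squares solution obtained by differencing squared ranges against a reference sensor. It is not the paper's joint algebraic estimator, which additionally refines the uncertain sensor positions and handles multiple sources.

```python
import numpy as np

def toa_localize(sensors, ranges):
    """Locate one source from TOA-derived ranges r_i = ||u - s_i|| with known
    sensor positions, by differencing squared ranges against sensor 0 to get
    a linear system in the source position u."""
    s0, r0 = sensors[0], ranges[0]
    A = 2.0 * (sensors[1:] - s0)
    b = (r0**2 - ranges[1:]**2
         + np.sum(sensors[1:]**2, axis=1) - np.sum(s0**2))
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    return u

# Example: five sensors in the plane, noiseless ranges to a source at (3, 4).
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0],
                    [10.0, 10.0], [5.0, -5.0]])
source = np.array([3.0, 4.0])
ranges = np.linalg.norm(sensors - source, axis=1)
print(toa_localize(sensors, ranges))  # approximately [3. 4.]
```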
Interactive fusion in distributed detection: Architecture and performance analysis
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6638463
E. Akofor, Biao Chen
Abstract: Within the Neyman-Pearson framework, we investigate the effect of feedback in two-sensor tandem fusion networks with conditionally independent observations. While there is a noticeable improvement in the performance of the fixed-sample-size Neyman-Pearson (NP) test, it is shown that feedback has no effect on the asymptotic performance characterized by the Kullback-Leibler (KL) distance. The result can be extended to an interactive fusion system where the fusion center and the sensor may undergo multiple steps of interaction.
Citations: 8
Optimal counterforensics for histogram-based forensics
2013 IEEE International Conference on Acoustics, Speech and Signal Processing · Pub Date: 2013-05-26 · DOI: 10.1109/ICASSP.2013.6638218
Pedro Comesaña Alfaro, F. Pérez-González
Abstract: There has been recent interest in counterforensics as an adversarial approach to forensic detectors. Most existing counterforensics strategies, although successful, are based on heuristic criteria, and their optimality is not proven. In this paper, the optimal strategy for modifying a content in order to fool a histogram-based forensic detector is derived. The proposed attack relies on the assumption of a convex cost function; special attention is paid to the Euclidean norm, obtaining the optimal attack in the MSE sense. In order to prove the usefulness of the proposed strategy, we employ it to successfully attack a well-known algorithm for detecting double JPEG compression.
Citations: 33