2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Scalable Multilevel Quantization for Distributed Detection
G. Gul, Michael Basler
Pub Date: 2021-06-06 | DOI: 10.1109/ICASSP39728.2021.9414032 | Pages: 5200-5204
Abstract: A scalable algorithm is derived for multilevel quantization of sensor observations in distributed sensor networks, which consist of a number of sensors transmitting summary information about their observations to a fusion center for a final decision. The proposed algorithm directly minimizes the overall error probability of the network, without resorting to pseudo objective functions such as distances between probability distributions. The problem formulation makes it possible to consider globally optimum error minimization at the fusion center and person-by-person optimum quantization at each sensor. The complexity of the algorithm is quasi-linear for i.i.d. sensors. Experimental results indicate that the proposed scheme is superior to the current state of the art.
Citations: 1
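The person-by-person optimization idea can be illustrated on a toy network: two i.i.d. sensors with one-bit quantizers, an AND fusion rule, and a Gaussian shift-in-mean test. This is a hedged sketch of the general approach, not the paper's scalable multilevel algorithm; the fusion rule, equal priors, and threshold grid are assumptions made here for illustration.

```python
import numpy as np
from math import erf, sqrt

def gauss_tail(x):
    """P(N(0,1) > x), the Gaussian tail probability."""
    return 0.5 * (1.0 - erf(x / sqrt(2.0)))

def error_prob(t0, t1, mu=1.0):
    """Overall error probability for two independent sensors with 1-bit
    quantizers (thresholds t0, t1), an AND fusion rule and equal priors.
    H0: N(0,1) vs H1: N(mu,1); sensor i reports 1 iff x_i > t_i."""
    p_false_alarm = gauss_tail(t0) * gauss_tail(t1)
    p_miss = 1.0 - gauss_tail(t0 - mu) * gauss_tail(t1 - mu)
    return 0.5 * (p_false_alarm + p_miss)

def person_by_person(mu=1.0, iters=20):
    """Coordinate descent on the thresholds: each sensor's quantizer is
    re-optimized while the other is held fixed (person-by-person optimum)."""
    grid = np.linspace(-3.0, 4.0, 2001)
    t = [0.0, 0.0]
    for _ in range(iters):
        for i in (0, 1):
            errs = np.array([error_prob(g, t[1 - i], mu) for g in grid])
            t[i] = float(grid[int(np.argmin(errs))])
    return t, error_prob(t[0], t[1], mu)

thresholds, pe = person_by_person()
```

Because each coordinate step can only lower the objective, the final error probability is no worse than that of the naive zero-threshold quantizers.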
Linear Model-Based Intra Prediction in VVC Test Model
R. G. Youvalari
Pub Date: 2020-01-01 | DOI: 10.1109/ICASSP40776.2020.9054405 | Pages: 4417-4421
Citations: 1
Practical Concentric Open Sphere Cardioid Microphone Array Design for Higher Order Sound Field Capture
Mark R. P. Thomas
Pub Date: 2019-05-07 | DOI: 10.1109/ICASSP.2019.8683559 | Pages: 666-670
Abstract: The problem of higher order sound field capture with spherical microphone arrays is considered. While A-format cardioid designs are commonplace for first order capture, interest remains in the increased spatial resolution delivered by higher order arrays. Spherical arrays typically use omnidirectional microphones mounted on a rigid baffle, from which higher order spatial components are estimated by accounting for radial mode strength. This produces a design trade-off between small arrays, which improve spatial aliasing performance, and large arrays, which reduce the amplification of instrument noise at low frequencies. A practical open sphere design is proposed that mounts cardioid microphones at multiple radii to fulfill both criteria. A design example with two 16-channel cardioid spheres at 42 mm and 420 mm radius produces white noise gain above unity on third order components down to 200 Hz, a decade lower than a rigid 32-channel 42 mm sphere of omnidirectional microphones.
Citations: 4
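The design criterion cited here, white noise gain (WNG) above unity, measures a beamformer's robustness against spatially white sensor self-noise. A minimal numpy sketch of the standard WNG formula follows; the steering vector and delay-and-sum weights below are generic illustrations, not the paper's concentric array design.

```python
import numpy as np

def white_noise_gain(w, d):
    """White noise gain of a beamformer with weights w and steering
    vector d: WNG = |w^H d|^2 / (w^H w). Values above 1 (0 dB) mean
    the beamformer attenuates sensor self-noise relative to the signal."""
    return np.abs(np.vdot(w, d)) ** 2 / np.real(np.vdot(w, w))

# Delay-and-sum (matched) weights achieve the maximum WNG = N sensors.
N = 16
rng = np.random.default_rng(1)
d = np.exp(1j * 2 * np.pi * rng.random(N))   # unit-modulus steering vector
w = d / N
wng_db = 10.0 * np.log10(white_noise_gain(w, d))
```

Higher-order spatial components built from small arrays push WNG well below unity at low frequencies, which is the noise-amplification problem the multi-radius design targets.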
Trigonometric Interpolation Beamforming for a Circular Microphone Array
C. Schuldt
Pub Date: 2019-05-01 | DOI: 10.1109/ICASSP.2019.8682843 | Pages: 431-435
Abstract: Polynomial beamforming has previously been proposed for addressing the non-trivial problem of integrating acoustic echo cancellation with adaptive microphone beamforming. This paper demonstrates a design example for a circular array where traditional polynomial beamforming approaches exhibit severe (over 10 dB) directivity index (DI) oscillations at the edges of the design interval, leading to severe DI degradation for certain look directions. A solution, based on trigonometric interpolation, is proposed that stabilizes the oscillations significantly, resulting in a DI that deviates only about 1 dB from that of a fixed beamformer over all look directions.
Citations: 3
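Trigonometric interpolation suits a circular array because the look direction is periodic, so a finite Fourier series avoids the edge-of-interval oscillations that polynomials suffer. As a hedged illustration (not the paper's beamformer design), interpolating a periodic directivity pattern from uniform azimuth samples can be sketched with the FFT:

```python
import numpy as np

def trig_interp(samples, theta):
    """Trigonometric interpolation of a real periodic function from N
    uniform samples on [0, 2*pi), evaluated at angles theta. The FFT
    coefficients define a finite Fourier series."""
    n = len(samples)
    c = np.fft.fft(samples) / n
    k = np.fft.fftfreq(n, d=1.0 / n)          # integer frequencies
    theta = np.atleast_1d(theta)
    # Sum c_k * exp(j*k*theta); real input gives a real result up to rounding.
    vals = (c[None, :] * np.exp(1j * np.outer(theta, k))).sum(axis=1)
    return vals.real

# Example: a cardioid-like pattern sampled at 8 uniform look directions.
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
pattern = 0.5 + 0.5 * np.cos(angles)
fine = np.linspace(0, 2 * np.pi, 360, endpoint=False)
reconstructed = trig_interp(pattern, fine)
```

For a band-limited pattern like this one, the trigonometric interpolant is exact at every azimuth, with no preferred "edge" directions.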
Improving ASR Robustness to Perturbed Speech Using Cycle-consistent Generative Adversarial Networks
Sri Harsha Dumpala, I. Sheikh, Rupayan Chakraborty, Sunil Kumar Kopparapu
Pub Date: 2019-05-01 | DOI: 10.1109/ICASSP.2019.8683793 | Pages: 5726-5730
Abstract: Naturally introduced perturbations in the audio signal, caused by the emotional and physical states of the speaker, can significantly degrade the performance of Automatic Speech Recognition (ASR) systems. In this paper, we propose a front-end based on a Cycle-Consistent Generative Adversarial Network (CycleGAN) which transforms naturally perturbed speech into normal speech, and hence improves the robustness of an ASR system. The CycleGAN model is trained on non-parallel examples of perturbed and normal speech. Experiments on spontaneous laughter-speech and creaky-voice datasets show that the performance of four different ASR systems improves when using speech obtained from the CycleGAN-based front-end, compared to directly using the original perturbed speech. Visualization of the features of the laughter-perturbed speech and those generated by the proposed front-end further demonstrates the effectiveness of our approach.
Citations: 2
Blind Quality Evaluator for Screen Content Images via Analysis of Structure
Guanghui Yue, Chunping Hou, Weisi Lin
Pub Date: 2019-05-01 | DOI: 10.1109/ICASSP.2019.8682371 | Pages: 4050-4054
Abstract: Existing blind evaluators for screen content images (SCIs) are mainly learning-based and require a number of training images with co-registered human opinion scores. However, existing databases are small, and generating human opinion scores at scale is labor-intensive, time-consuming, and expensive. In this study, we propose a novel blind quality evaluator that requires no training. Specifically, the proposed method first calculates the gradient similarity between a distorted image and its translated versions in four directions to estimate structural distortion, the most prominent distortion in SCIs. Given that edge regions are more easily distorted, the inter-scale gradient similarity is then calculated as a weighting map. Finally, the proposed metric is obtained by combining the gradient similarity map with the weighting map. Experimental results demonstrate its effectiveness and efficiency on a publicly available SCI database.
Citations: 2
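As a rough illustration of the structural-distortion idea, the gradient similarity between an image and its translated versions can be sketched as below. This is not the authors' exact formulation: the similarity constant `c`, the mean pooling, and the one-pixel translations are assumptions made for the sketch.

```python
import numpy as np

def gradient_mag(img):
    """Gradient magnitude from central differences."""
    gy, gx = np.gradient(img.astype(float))
    return np.sqrt(gx ** 2 + gy ** 2)

def gradient_similarity(img, c=0.01):
    """Mean gradient similarity between an image and its one-pixel
    translations in four directions; structural content (sharp edges)
    pulls the score below 1, flat regions score exactly 1."""
    g = gradient_mag(img)
    sims = []
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        gs = gradient_mag(np.roll(img, shift, axis=(0, 1)))
        s = (2.0 * g * gs + c) / (g ** 2 + gs ** 2 + c)  # SSIM/GSM-style ratio
        sims.append(s.mean())
    return float(np.mean(sims))

# A sharp-edged synthetic "screen content" patch.
img = np.zeros((64, 64))
img[:, 32:] = 1.0
score = gradient_similarity(img)
```

A constant image scores exactly 1 (no structure to disturb), while the synthetic edge image scores strictly below 1 because translation misaligns its gradients.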
Human Behaviour Recognition Using Wifi Channel State Information
Daanish Ali Khan, Saquib Razak, B. Raj, Rita Singh
Pub Date: 2019-05-01 | DOI: 10.1109/ICASSP.2019.8682821 | Pages: 7625-7629
Abstract: Device-free human behaviour recognition is the task of automatically recognizing physical activity from a series of observations, without directly attaching sensors to the subject. Behaviour recognition has applications in security, health-care, and smart homes. The ubiquity of WiFi devices has generated recent interest in using Channel State Information (CSI), which describes the propagation of RF signals, for behaviour recognition, leveraging the relationship between body movement and variations in CSI streams. Existing work on CSI-based behaviour recognition has established the efficacy of deep neural network classifiers, yielding performance that surpasses traditional techniques. In this paper, we propose a deep Recurrent Neural Network (RNN) model for CSI-based behaviour recognition that combines a Convolutional Neural Network (CNN) feature extractor with stacked Long Short-Term Memory (LSTM) networks for sequence classification. We also examine CSI de-noising techniques that allow faster training and model convergence. Our model yields a significant improvement in classification accuracy compared to existing techniques.
Citations: 10
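The CSI de-noising step mentioned in the abstract can be illustrated with one common baseline, a centered moving average over a CSI amplitude stream. The paper does not specify this particular filter; the window length and the synthetic signal model below are assumptions made for the sketch.

```python
import numpy as np

def moving_average_denoise(csi, win=5):
    """Smooth a 1-D CSI amplitude stream with a centered moving average,
    a simple pre-processing baseline before feeding CSI windows to a
    CNN/LSTM classifier. Edge samples reuse the boundary value."""
    kernel = np.ones(win) / win
    pad = win // 2
    padded = np.pad(csi, pad, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Synthetic CSI amplitude: body movement modulates the channel slowly,
# while measurement noise sits on top.
t = np.linspace(0, 1, 200)
rng = np.random.default_rng(0)
clean = 1.0 + 0.3 * np.sin(2 * np.pi * 3 * t)
noisy = clean + 0.1 * rng.standard_normal(t.size)
denoised = moving_average_denoise(noisy)
```

The filter trades a slight attenuation of the slow movement component for a roughly win-fold reduction in noise variance, which is why such smoothing can speed up classifier convergence.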
A Novel Deterministic Sensing Matrix Based on Kasami Codes for Cluster Structured Sparse Signals
Hamid Nouasria, Mohamed Et-tolba
Pub Date: 2019-05-01 | DOI: 10.1109/ICASSP.2019.8683593 | Pages: 1592-1596
Abstract: Cluster structured compressive sensing is a new direction in compressive sensing that deals with cluster structured sparse (CSS) signals. In this paper, we propose a sensing matrix based on Kasami codes for CSS signals. Kasami codes have been the subject of several constructions; our idea is to adapt these constructions to CSS signals. The proposed matrix gives more attention to the clusters. Simulation results show the superior performance of our matrix: it achieves the highest rate of exact recovery. Moreover, its deterministic nature makes it well suited to hardware implementation.
Citations: 1
Embedding Physical Augmentation and Wavelet Scattering Transform to Generative Adversarial Networks for Audio Classification with Limited Training Resources
Kah Kuan Teh, T. H. Dat
Pub Date: 2019-05-01 | DOI: 10.1109/ICASSP.2019.8683199 | Pages: 3262-3266
Abstract: This paper addresses audio classification with limited training resources. We first investigate different types of data augmentation, including physical modeling, the wavelet scattering transform, and Generative Adversarial Networks (GAN). We then propose a novel GAN method that embeds physical augmentation and the wavelet scattering transform in its processing. Experimental results on Google Speech Commands show significant improvements for the proposed method when training with limited resources: it lifts classification accuracy from the best ResNet baselines of 62.06% and 77.29% to 91.96% and 93.38% when training with 10% and 25% of the training data, respectively.
Citations: 1
Joint parameter and state estimation for wave-based imaging and inversion
T. Leeuwen
Pub Date: 2017-03-01 | DOI: 10.1109/ICASSP.2017.7953350 | Pages: 6210-6214
Abstract: In many applications, such as exploration geophysics, seismology and ultrasound imaging, waves are harnessed to image the interior of an object. We can pose the image formation process as a non-linear data-fitting problem: fit the coefficients of a wave equation such that its solution approximately fits the observations. This allows one to deal effectively with errors in the observations.
Citations: 1
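The non-linear data-fitting formulation can be sketched on a tiny 1-D Helmholtz-like problem: simulate data for a true coefficient, then choose the coefficient whose simulated wavefield best fits the receiver samples. Everything below — the grid size, source and receiver positions, and the grid search standing in for the paper's optimization machinery — is an illustrative assumption, not the paper's method.

```python
import numpy as np

def simulate(m, n=50, src=10):
    """Solve a tiny 1-D Helmholtz-like system A(m) u = q: a discrete
    Laplacian plus a squared-slowness term m, point source at src,
    Dirichlet boundaries."""
    L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))
    q = np.zeros(n)
    q[src] = 1.0
    return np.linalg.solve(L + m * np.eye(n), q)

# "Observed" data: the wavefield sampled at a few receiver indices.
receivers = [20, 30, 40]
m_true = 0.5
d = simulate(m_true)[receivers]

# Non-linear data fitting: pick the coefficient whose simulated
# wavefield best matches the receiver data (grid search for clarity).
grid = np.linspace(0.1, 0.9, 161)
misfit = [np.sum((simulate(m)[receivers] - d) ** 2) for m in grid]
m_est = float(grid[int(np.argmin(misfit))])
```

The least-squares misfit makes the fit tolerant of observation noise, which is the robustness the abstract alludes to; real wave-based inversion replaces the scalar coefficient with a spatially varying model and the grid search with gradient-based joint parameter/state optimization.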