2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): latest papers

Analysis of singing voice for epoch extraction using Zero Frequency Filtering method
Sudarsana Reddy Kadiri, B. Yegnanarayana
{"title":"Analysis of singing voice for epoch extraction using Zero Frequency Filtering method","authors":"Sudarsana Reddy Kadiri, B. Yegnanarayana","doi":"10.1109/ICASSP.2015.7178774","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178774","url":null,"abstract":"Epoch is the instant of significant excitation of the vocal tract system during the production of voiced speech. Estimation of epochs or Glottal closure instants (GCIs) is a well studied topic in the speech analysis. From the recent studies on GCI detection from singing voice with state-of-art methods proposed for speech, there exist a clear gap in accuracy between speech and singing voice. This is because of source-filter interaction in singing voice compared to speech. Performance of existing algorithms deteriorates as most of the techniques depends on the ability to model the vocal tract system in order to emphasize the excitation characteristics in the residual. The objective of this paper is to analyze the singing voice for the estimation of epochs by studying the characteristics of the source-filter interaction and the effect of wider range of pitch using the Zero Frequency Filtering (ZFF) method. It is observed that high source-filter interaction can be captured in the form of the impulse-like excitation by passing the signal through three ideal digital resonators having poles at zero frequency, and the effect of wider range of pitch can be controlled by processing short segment (0.4-0.5 sec) signal.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121468375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
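The abstract above describes passing the signal through three ideal resonators with poles at zero frequency and working on short segments. A minimal sketch of that idea follows. It is not the authors' implementation: it assumes each resonator has a double pole at z = 1 (as in the usual ZFF formulation), removes the resulting polynomial trend by repeated local-mean subtraction, and takes negative-to-positive zero crossings as epoch candidates. The function names and the `window_ms` default are hypothetical.

```python
import numpy as np

def zero_frequency_filter(signal, fs, n_resonators=3, window_ms=15.0):
    """Trend-removed output of a cascade of zero-frequency resonators.

    Assumption: each resonator is y[n] = x[n] + 2*y[n-1] - y[n-2]
    (double pole at z = 1), which is the same as two running sums.
    """
    s = np.asarray(signal, dtype=np.float64)
    # Difference the input to remove any DC offset before integrating.
    y = np.diff(s, prepend=s[0])
    for _ in range(2 * n_resonators):
        y = np.cumsum(y)
    # The cascade output grows polynomially; subtract a symmetric local mean
    # once per resonator to cancel that trend. window_ms is an assumed value
    # of roughly one to two pitch periods.
    half = int(round(window_ms * 1e-3 * fs / 2))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(n_resonators):
        y = y - np.convolve(y, kernel, mode="same")
    # Note: the first and last n_resonators * half samples are edge-distorted.
    return y

def epoch_candidates(zff):
    """Indices where the filtered signal crosses zero from negative to positive."""
    return np.flatnonzero((zff[:-1] < 0) & (zff[1:] >= 0)) + 1

# Usage on a synthetic 0.4 s segment: an impulse train at 100 Hz, fs = 8 kHz.
fs = 8000
segment = np.zeros(3200)
segment[40::80] = 1.0
zff = zero_frequency_filter(segment, fs)
epochs = epoch_candidates(zff[500:2700])  # skip the edge-distorted samples
```

Keeping the segment short, as the abstract suggests, also keeps the running sums well inside double-precision range. Which zero-crossing polarity marks the epochs depends on the polarity of the recording, so that choice is an assumption here.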
Information extraction from large multi-layer social networks
Brandon Oselio, Alex Kulesza, A. Hero
{"title":"Information extraction from large multi-layer social networks","authors":"Brandon Oselio, Alex Kulesza, A. Hero","doi":"10.1109/ICASSP.2015.7179013","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7179013","url":null,"abstract":"Social networks often encode community structure using multiple distinct types of links between nodes. In this paper we introduce a novel method to extract information from such multi-layer networks, where each type of link forms its own layer. Using the concept of Pareto optimality, community detection in this multi-layer setting is formulated as a multiple criterion optimization problem. We propose an algorithm for finding an approximate Pareto frontier containing a family of solutions. The power of this approach is demonstrated on a Twitter dataset, where the nodes are hashtags and the layers correspond to (1) behavioral edges connecting pairs of hashtags whose temporal profiles are similar and (2) relational edges connecting pairs of hashtags that appear in the same tweets.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121594268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
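The abstract above frames multi-layer community detection as a multiple-criterion problem solved through a Pareto frontier. The sketch below shows only the generic notion of non-dominated solutions, not the paper's approximation algorithm. It assumes every candidate solution has already been scored with one cost per layer (lower is better); the function name and the example scores are hypothetical.

```python
def pareto_front(costs):
    """Return the indices of non-dominated cost vectors (all criteria minimized).

    A vector q dominates p when q is no worse than p in every criterion
    and strictly better in at least one.
    """
    front = []
    for i, p in enumerate(costs):
        dominated = any(
            all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p))
            for j, q in enumerate(costs)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Hypothetical per-layer costs (behavioral layer, relational layer)
# for five candidate partitions.
scores = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
front = pareto_front(scores)
```

Here the candidates at indices 0, 1 and 3 are kept: each is best on some trade-off between the two layers, while the other two are beaten on both layers by candidate 1.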
Objective quality prediction for haptic texture signal compression
R. Chaudhari, Yongjae Yoo, Clemens Schuwerk, Seungmoon Choi, E. Steinbach
{"title":"Objective quality prediction for haptic texture signal compression","authors":"R. Chaudhari, Yongjae Yoo, Clemens Schuwerk, Seungmoon Choi, E. Steinbach","doi":"10.1109/ICASSP.2015.7178366","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178366","url":null,"abstract":"Perceptual quality for media compression algorithms is traditionally evaluated through user studies. Such studies are time consuming, laborious and expensive, slowing down the development of new signal processing algorithms. To address this problem, a number of algorithmic quality prediction methodologies have been developed in the audio and video fields, something that is currently lacking in haptics research. In this paper, we present a novel method for predicting the perceptual quality degradation of compressed haptic texture signals. For this purpose, abstract perceptual features like Roughness, Brightness, etc. that capture the subjective experience of textures are exploited, in addition to low-level psychophysical models from the literature. As compared to the state-of-the-art, the presented prediction methodology shows an approximately 30% improvement in explaining the variance in the perceptual data.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"164 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114736968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Fast and efficient intra coding techniques for smooth regions in screen content coding based on boundary prediction samples
Sik-Ho Tsang, Yui-Lam Chan, W. Siu
{"title":"Fast and efficient intra coding techniques for smooth regions in screen content coding based on boundary prediction samples","authors":"Sik-Ho Tsang, Yui-Lam Chan, W. Siu","doi":"10.1109/ICASSP.2015.7178202","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178202","url":null,"abstract":"This paper presents fast and efficient intra prediction algorithms for screen content coding (SCC). The proposed algorithms focus on smooth regions frequently appeared in screen content videos, which have the characteristics of noiselessness. All the samples in a noiseless smooth region exhibit exactly the same pixel value. We then propose two intra coding techniques for noiseless smooth regions in SCC based on the smoothness of the boundary samples which are used for intra prediction. Our proposed algorithm can reduce computational complexity by at most 26.7% while keeping nearly the same video quality. Moreover, by removing the redundant coding bits for intra prediction modes, computational complexity can be further reduced to at most 53.3% in terms of encoding time with bitrate reduction up to 1.2%.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124381255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
Optically visualized sound field reconstruction based on sparse selection of point sound sources
K. Yatabe, Yasuhiro Oikawa
{"title":"Optically visualized sound field reconstruction based on sparse selection of point sound sources","authors":"K. Yatabe, Yasuhiro Oikawa","doi":"10.1109/ICASSP.2015.7178020","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178020","url":null,"abstract":"Visualization is an effective way to understand the behavior of a sound field. There are several methods for such observation including optical measurement technique which enables a non-destructive acoustical observation by detecting density variation of the medium. For audible sound propagating through the air, however, smallness of the variation requires high sensitivity of the measuring system that causes problematic noise contamination. In this paper, a method for reconstructing two-dimensional audible sound fields from noisy optical observation is proposed.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124388839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Multivariate lattices for encrypted image processing
A. Pedrouzo-Ulloa, J. Troncoso-Pastoriza, F. Pérez-González
{"title":"Multivariate lattices for encrypted image processing","authors":"A. Pedrouzo-Ulloa, J. Troncoso-Pastoriza, F. Pérez-González","doi":"10.1109/ICASSP.2015.7178262","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178262","url":null,"abstract":"Images are inherently sensitive signals that require privacy-preserving solutions when processed in an untrusted environment, but their efficient encrypted processing is particularly challenging due to their structure and size. This work introduces a new cryptographic hard problem called m-RLWE (multivariate Ring Learning with Errors) extending RLWE. It gives support to lattice cryptosystems that allow for encrypted processing of multidimensional signals. We show an example cryptosystem and prove that it outperforms its RLWE counterpart in terms of security against basis-reduction attacks, efficiency and cipher expansion for encrypted image processing.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"509 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127603827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Multiple target track-before-detect in compound Gaussian clutter
S. P. Ebenezer, A. Papandreou-Suppappola
{"title":"Multiple target track-before-detect in compound Gaussian clutter","authors":"S. P. Ebenezer, A. Papandreou-Suppappola","doi":"10.1109/ICASSP.2015.7178429","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178429","url":null,"abstract":"In this paper, we extend the multiple transition mode track-before-detect (TBD) algorithm to track multiple low observable targets in compound Gaussian sea clutter. The proposed TBD framework uses the un-thresholded fast time radar measurements to track multiple targets in low signal-to-clutter ratios (SCRs). The TBD is implemented using particle filtering (PF), and we derive the generalized likelihood ratio needed to update the particle weights. The maximum likelihood estimate of the texture and the covariance matrix of the speckle are also derived and implemented using a fixed point algorithm. The tracking performance of the proposed algorithm is investigated using three low observable targets that enter and leave the field of view (FOV) at different time steps and under varying environmental conditions.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127717346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Multi-task deep neural network acoustic models with model adaptation using discriminative speaker identity for whisper recognition
Jingjie Li, I. Mcloughlin, Cong Liu, Shaofei Xue, Si Wei
{"title":"Multi-task deep neural network acoustic models with model adaptation using discriminative speaker identity for whisper recognition","authors":"Jingjie Li, I. Mcloughlin, Cong Liu, Shaofei Xue, Si Wei","doi":"10.1109/ICASSP.2015.7178916","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178916","url":null,"abstract":"This paper presents a study on large vocabulary continuous whisper automatic recognition (wLVCSR). wLVCSR provides the ability to use ASR equipment in public places without concern for disturbing others or leaking private information. However the task of wLVCSR is much more challenging than normal LVCSR due to the absence of pitch which not only causes the signal to noise ratio (SNR) of whispers to be much lower than normal speech but also leads to flatness and formant shifts in whisper spectra. Furthermore, the amount of whisper data available for training is much less than for normal speech. In this paper, multi-task deep neural network (DNN) acoustic models are deployed to solve these problems. Moreover, model adaptation is performed on the multi-task DNN to normalize speaker and environmental variability in whispers based on discriminative speaker identity information. On a Mandarin whisper dictation task, with 55 hours of whisper data, the proposed SI multi-task DNN model can achieve 56.7% character error rate (CER) improvement over a baseline Gaussian Mixture Model (GMM), discriminatively trained only using the whisper data. Besides, the CER of the proposed model for normal speech can reach 15.2%, which is close to the performance of a state-of-the-art DNN trained with one thousand hours of speech data. From this baseline, the model-adapted DNN gains a further 10.9% CER reduction over the generic model.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127737163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
DOA estimation by covariance matrix sparse reconstruction of coprime array
Chengwei Zhou, Zhiguo Shi, Yujie Gu, N. Goodman
{"title":"Doa estimation by covariance matrix sparse reconstruction of coprime array","authors":"Chengwei Zhou, Zhiguo Shi, Yujie Gu, N. Goodman","doi":"10.1109/ICASSP.2015.7178395","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178395","url":null,"abstract":"In this paper, we propose a direction-of-arrival estimation method by covariance matrix sparse reconstruction of coprime array. Specifically, source locations are estimated by solving a newly formulated convex optimization problem, where the difference between the spatially smoothed covariance matrix and the sparsely reconstructed one is minimized. Then, a sliding window scheme is designed for source enumeration. Finally, the power of each source is re-estimated as a least squares problem. Compared with existing methods, the proposed method achieves more accurate source localization and power estimation performance with full utilization of increased degrees of freedom provided by coprime array.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127754622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
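The abstract above relies on the increased degrees of freedom provided by a coprime array. The sketch below illustrates where those come from by counting the distinct lags in the difference co-array. It assumes the prototype coprime layout (one subarray of n sensors spaced m units apart, one of m sensors spaced n units apart, sharing the sensor at the origin); the paper may use a different configuration, and the function names are hypothetical.

```python
from math import gcd

def coprime_array_positions(m, n):
    """Sensor positions, in units of the base spacing, of a prototype coprime array."""
    if gcd(m, n) != 1:
        raise ValueError("m and n must be coprime")
    sub1 = {m * k for k in range(n)}  # n sensors, spacing m
    sub2 = {n * k for k in range(m)}  # m sensors, spacing n
    return sorted(sub1 | sub2)

def difference_lags(positions):
    """All distinct pairwise position differences (the difference co-array)."""
    return sorted({a - b for a in positions for b in positions})

positions = coprime_array_positions(3, 5)
lags = difference_lags(positions)
```

With m = 3 and n = 5, the seven physical sensors yield 21 distinct lags, with a hole-free run from -7 to 7. Covariance-domain methods exploit this larger virtual aperture.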
Assistive listening headsets for high noise environments: Protection and communication
S. Nordholm, A. Davis, Pei Chee Yong, H. H. Dam
{"title":"Assistive listening headsets for high noise environments: Protection and communication","authors":"S. Nordholm, A. Davis, Pei Chee Yong, H. H. Dam","doi":"10.1109/ICASSP.2015.7179074","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7179074","url":null,"abstract":"In industrial noise environments, the use of assistive listening headsets is a means to provide adequate access to voice communication while wearing hearing protection. This paper presents a performance evaluation and comparison of two different methods to provide the binaural speech enhancement in real industrial noise scenarios. The investigated binaural methods based on differential beamforming and multichannel Wiener filter show different strengths and weaknesses. A transient noise suppression algorithm is also proposed and evaluated. Performance evaluation shows that this algorithm, together with the binaural multi-channel Wiener filter approach, can successfully reduce the hammering noise. This can be observed from the PESQ scores and the signal characteristics.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126277381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3