2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers

Contour-based 3D tongue motion visualization using ultrasound image sequences
Kele Xu, Yin Yang, Clémence Leboullenger, P. Roussel-Ragot, B. Denby
{"title":"Contour-based 3D tongue motion visualization using ultrasound image sequences","authors":"Kele Xu, Yin Yang, Clémence Leboullenger, P. Roussel-Ragot, B. Denby","doi":"10.1109/ICASSP.2016.7472705","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7472705","url":null,"abstract":"This article describes a contour-based 3D tongue deformation visualization framework using B-mode ultrasound image sequences. A robust, automatic tracking algorithm characterizes tongue motion via a contour, which is then used to drive a generic 3D Finite Element Model (FEM). A novel contour-based 3D dynamic modeling method is presented. Modal reduction and modal warping techniques are applied to model the deformation of the tongue physically and efficiently. This work can be helpful in a variety of fields, such as speech production, silent speech recognition, articulation training, speech disorder study, etc.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126795495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Depth map estimation using census transform for light field cameras
Takayuki Tomioka, Kazu Mishiba, Y. Oyamada, K. Kondo
{"title":"Depth map estimation using census transform for light field cameras","authors":"Takayuki Tomioka, Kazu Mishiba, Y. Oyamada, K. Kondo","doi":"10.1109/ICASSP.2016.7471955","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7471955","url":null,"abstract":"Depth estimation for lens-array type cameras is a challenging problem because of sensor noise and radiometric distortion, which is a global brightness change between sub-aperture images caused by a vignetting effect of the micro-lenses. We propose a depth map estimation method which has robustness against the sensor noise and the radiometric distortion. Our method first binarizes sub-aperture images by applying the census transform. Next, the binarized images are matched by computing the majority operations between corresponding bits and summing up the Hamming distance. An initial map obtained by matching has ambiguity caused by extremely short baselines among sub-aperture images. We refine an initial map by the optimization which uses the assumption that the variations of the depth values in the depth map and of the pixel values in the texture-less objects are similar. Experiments show that our method outperforms the conventional methods.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127005360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
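The abstract of this entry mentions binarizing images with the census transform and matching them by Hamming distance. A minimal Python sketch of those two building blocks follows. It is not the authors' implementation: the function names, the 3×3 window, and the border clamping are assumptions made for illustration, and the majority operation across sub-aperture images and the refinement step are left out.

```python
def census_transform(img, radius=1):
    """Encode each pixel as an integer whose bits record whether each
    neighbour in the (2*radius+1)^2 window is darker than the centre."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    # Clamp at the image border (an assumption of this sketch).
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    bit = 1 if img[ny][nx] < img[y][x] else 0
                    bits = (bits << 1) | bit
            out[y][x] = bits
    return out


def hamming_distance(a, b):
    """Number of differing bits between two census codes."""
    return bin(a ^ b).count("1")


# Hypothetical 3x3 image and a globally brightened copy of it.
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[2 * v + 5 for v in row] for row in image]
codes = census_transform(image)
codes_brighter = census_transform(brighter)
```

Because the census code depends only on the ordering of intensities inside the window, a global, order-preserving brightness change leaves the codes unchanged, which is the property that makes this kind of matching tolerant of radiometric distortion.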
Pushing the limit of non-rigid structure-from-motion by shape clustering
Huizhong Deng, Yuchao Dai
{"title":"Pushing the limit of non-rigid structure-from-motion by shape clustering","authors":"Huizhong Deng, Yuchao Dai","doi":"10.1109/ICASSP.2016.7472027","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7472027","url":null,"abstract":"Recovering both camera motions and non-rigid 3D shapes from 2D feature tracks is a challenging problem in computer vision. Long-term, complex non-rigid shape variations in real world videos further increase the difficulty for Non-rigid structure-from-motion (NRSfM). Furthermore, there does not exist a criterion to characterize the possibility in recovering the non-rigid shapes and camera motions (i.e., how easy or how difficult the problem could be). In this paper, we first present an analysis to the \"reconstructability\" measure for NRSfM, where we show that 3D shape complexity and camera motion complexity can be used to index the re-constructability. We propose an iterative shape clustering based method to NRSfM, which alternates between 3D shape clustering and 3D shape reconstruction. Thus, the global reconstructability has been improved and better reconstruction can be achieved. Experimental results on long-term, complex non-rigid motion sequences show that our method outperforms the current state-of-the-art methods by a margin.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129232220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Distributed dyadic cyclic descent for non-negative matrix factorization
M. Ulfarsson, V. Solo, J. Sigurdsson, J. R. Sveinsson
{"title":"Distributed dyadic cyclic descent for non-negative matrix factorization","authors":"M. Ulfarsson, V. Solo, J. Sigurdsson, J. R. Sveinsson","doi":"10.1109/ICASSP.2016.7472489","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7472489","url":null,"abstract":"Non-negative matrix factorization (NMF) has found use in fields such as remote sensing and computer vision where the signals of interest are usually non-negative. Data dimensions in these applications can be huge and traditional algorithms break down due to unachievable memory demands. One is then compelled to consider distributed algorithms. In this paper, we develop for the first time a distributed version of NMF using the alternating direction method of multipliers (ADMM) algorithm and dyadic cyclic descent. The algorithm is compared to well established variants of NMF using simulated data, and is also evaluated using real remote sensing hyperspectral data.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124005338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Improving face detection with depth
Gregory P. Meyer, Steven Alfano, M. Do
{"title":"Improving face detection with depth","authors":"Gregory P. Meyer, Steven Alfano, M. Do","doi":"10.1109/ICASSP.2016.7471884","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7471884","url":null,"abstract":"Face detection serves an important role in many computer vision systems. Typically, a face detector identifies faces within a grayscale or color image. Due to the recent increase in consumer depth cameras, obtaining both color and depth images of a scene has never been easier. We propose a technique that utilizes depth information to improve face detection. Standard face detection methods, such as the Viola-Jones object detection framework, detect faces by searching an image at every location and scale. Our method increases the speed and accuracy of the Viola-Jones face detector by utilizing depth data to constrain the detector's search over the image. Leveraging a Kinect camera, we are able to detect faces 3.5× faster, while greatly reducing the number of false positives.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124246817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
An improved anthropometry-based customization method of individual head-related transfer functions
Xuejie Liu, Xiaoli Zhong
{"title":"An improved anthropometry-based customization method of individual head-related transfer functions","authors":"Xuejie Liu, Xiaoli Zhong","doi":"10.1109/ICASSP.2016.7471692","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7471692","url":null,"abstract":"Individual head-related transfer functions (HRTFs) are necessary for rendering authentic spatial perceptions in spatial audio applications. To obtain individual HRTFs while avoiding tedious and complicated measurement and calculation, an improved customization method based on anthropometry matching is proposed. In the method, a set of HRTFs, which is the best match to the pinna shape of the listener using four pinna-related anatomical parameters, is selected as the listener's individual HRTFs from a pre-acquired HRTF baseline database. A series of subject localization experiments was conducted to verify the effectiveness of the proposed method compared with the existing method. Results show that the median-plane localization performance of the customization method proposed in the present work is superior to that of the existing method, though performance improvement varies with source position.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123417809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Robust sparsity-promoting acoustic multi-channel equalization for speech dereverberation
I. Kodrasi, Ante Jukic, S. Doclo
{"title":"Robust sparsity-promoting acoustic multi-channel equalization for speech dereverberation","authors":"I. Kodrasi, Ante Jukic, S. Doclo","doi":"10.1109/ICASSP.2016.7471658","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7471658","url":null,"abstract":"This paper presents a novel signal-dependent method to increase the robustness of acoustic multi-channel equalization techniques against room impulse response (RIR) estimation errors. Aiming at obtaining an output signal which better resembles a clean speech signal, we propose to extend the acoustic multi-channel equalization cost function with a penalty function which promotes sparsity of the output signal in the short-time Fourier transform domain. Two conventionally used sparsity-promoting penalty functions are investigated, i.e., the l0-norm and the l1-norm, and the sparsity-promoting filters are iteratively computed using the alternating direction method of multipliers. Simulation results for several RIR estimation errors show that incorporating a sparsity-promoting penalty function significantly increases the robustness, with the l1-norm penalty function outperforming the l0-norm penalty function.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121180521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
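The abstract of this entry refers to an l1-norm penalty on short-time Fourier transform coefficients handled with the alternating direction method of multipliers. When an l1 term is split off in ADMM, its sub-problem is commonly solved by soft-thresholding. The sketch below shows that operator for complex coefficients. It is an illustration under assumptions: the abstract does not state that the authors' update takes this form, and the names `soft_threshold` and `lam` are hypothetical.

```python
def soft_threshold(z, lam):
    """Proximal operator of lam * |z| for a complex number z:
    shrink the magnitude by lam and keep the phase; small values become 0."""
    mag = abs(z)
    if mag <= lam:
        return 0j
    return z * (1.0 - lam / mag)


# Hypothetical STFT coefficients: one strong bin and one weak bin.
coeffs = [3 + 4j, 0.1 + 0.1j]
shrunk = [soft_threshold(c, 1.0) for c in coeffs]
```

Applied bin by bin, the operator zeroes weak coefficients and shrinks strong ones, which is how an l1 penalty pushes the output toward a sparse time-frequency representation.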
A generalized LDPC framework for robust and sublinear compressive sensing
Xu Chen, Dongning Guo
{"title":"A generalized LDPC framework for robust and sublinear compressive sensing","authors":"Xu Chen, Dongning Guo","doi":"10.1109/ICASSP.2016.7472553","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7472553","url":null,"abstract":"Compressive sensing aims to recover a high-dimensional sparse signal from a relatively small number of measurements. In this paper, a novel design of the measurement matrix is proposed. The design is inspired by the construction of generalized low-density parity-check codes, where the capacity-achieving point-to-point codes serve as subcodes to robustly estimate the signal support. In the case that each entry of the n-dimensional k-sparse signal lies in a known discrete alphabet, the proposed scheme requires only O(k log n) measurements and arithmetic operations. In the case of arbitrary, possibly continuous alphabet, an error propagation graph is proposed to characterize the residual estimation error. With O(k log^2 n) measurements and computational complexity, the reconstruction error can be made arbitrarily small with high probability.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114238365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Fast intra mode decision and block matching for HEVC screen content compression
Hao Zhang, Qiao-Yan Zhou, Ningning Shi, Feng Yang, Xin Feng, Zhan Ma
{"title":"Fast intra mode decision and block matching for HEVC screen content compression","authors":"Hao Zhang, Qiao-Yan Zhou, Ningning Shi, Feng Yang, Xin Feng, Zhan Ma","doi":"10.1109/ICASSP.2016.7471902","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7471902","url":null,"abstract":"Screen content coding (SCC) is the latest extension of the High-Efficiency Video Coding (HEVC) aiming to improve the compression efficiency of screen content video. With newly developed tools such as intra block copy (IntraBC) and palette (PLT) mode, SCC has been able to compress the desktop screens more efficiently but with significant complexity increase. In this paper, we improve the intra prediction from two aspects. Firstly, by leveraging the temporal correlation among coding units (CU), we develop a fast CU depth prediction scheme. Furthermore, adaptive search step is employed for further speed up of the time-consuming block matching in IntraBC. The overall encoding time is reduced by about 39% and 35% for the All Intra (AI) lossy and lossless encoding scenarios with negligible quality loss under the SCC common test condition.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116184440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
A hierarchical framework for language identification
S. Irtza, V. Sethu, Haris Bavattichalil, E. Ambikairajah, Haizhou Li
{"title":"A hierarchical framework for language identification","authors":"S. Irtza, V. Sethu, Haris Bavattichalil, E. Ambikairajah, Haizhou Li","doi":"10.1109/ICASSP.2016.7472793","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7472793","url":null,"abstract":"Most current language recognition systems model different levels of information such as acoustic, prosodic, phonotactic, etc. independently and combine the model likelihoods in order to make a decision. However, these are single level systems that treat all languages identically and hence incapable of exploiting any similarities that may exist within groups of languages. In this paper, a hierarchical language identification (HLID) framework is proposed that involves a series of classification decisions at multiple levels involving language clusters of decreasing sizes with individual languages identified only at the final level. The performance of proposed hierarchical framework is compared with a state-of-the-art LID system on the NIST 2007 database and the results indicate that the proposed approach outperforms state-of-the-art systems.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121510233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13