2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). Latest Publications

A method of generating uniformly distributed sequences over [0,K], where K+1 is not a power of two
R. Kuehnel, Yuke Wang
{"title":"A method of generating uniformly distributed sequences over [0,K], where K+1 is not a power of two","authors":"R. Kuehnel, Yuke Wang","doi":"10.1109/ICASSP.2003.1202488","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202488","url":null,"abstract":"A new methodology has been recently proposed for the efficient generation of multiple pseudo-random bit sequences that are statistically uncorrelated [1]. Random sequences that are uniformly distributed over a range [0,K], where K+1 is a power of 2, can be constructed by forming a vector of M independent bit sequences, where M=log/sub 2/ (K+1). We demonstrate that this method of construction represents a special case of a more generalized approach in which K can be any positive integer. The procedures described here can be used to efficiently generate multiple independent random sequences that are uniformly distributed over any range.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123669697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
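The abstract does not reproduce the generalized construction itself; the following is only a sketch of the standard idea it builds on: draw enough independent bits to cover [0, K] and reject out-of-range draws, so the accepted samples stay exactly uniform. Python's `random.getrandbits` stands in for the paper's multiple uncorrelated bit sequences.

```python
import random

def uniform_over_range(k: int) -> int:
    """Draw an integer uniformly from [0, K] via a bit vector plus rejection.

    When K+1 is a power of two, M = log2(K+1) bits map directly onto [0, K];
    otherwise we draw the smallest covering number of bits and reject values
    above K, which leaves the accepted samples exactly uniform.
    """
    m = max(k.bit_length(), 1)             # bits needed so that 2**m - 1 >= k
    while True:
        candidate = random.getrandbits(m)  # vector of m independent bits
        if candidate <= k:                 # accept only in-range draws
            return candidate

# Example: K = 5, so K+1 = 6 is not a power of two; the expected rejection
# rate is 2/8 = 25% per draw.
counts = [0] * 6
for _ in range(60000):
    counts[uniform_over_range(5)] += 1
print(counts)  # each bin should be close to 10000
```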
Time-domain method for tracking dispersive channels in MIMO OFDM systems
T. Roman, M. Enescu, V. Koivunen
{"title":"Time-domain method for tracking dispersive channels in MIMO OFDM systems","authors":"T. Roman, M. Enescu, V. Koivunen","doi":"10.1109/ICASSP.2003.1202662","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202662","url":null,"abstract":"In this paper we address the problem of channel estimation for multiple-input multiple-output OFDM systems for mobile users. A channel tracking and equalization method stemming from Kalman filtering is proposed for time-frequency selective channels. Tracking of the MIMO channel matrix is performed in the time-domain and equalization in the frequency domain. The computational complexity is significantly reduced by applying the matrix inversion lemma. Simulation results are presented using a realistic channel model in typical urban scenarios.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125756902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
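No equations appear in the abstract; the sketch below shows only the generic Kalman recursion such a time-domain tracker builds on, using a random-walk model for the vectorized channel and illustrative dimensions. The paper's complexity reduction via the matrix inversion lemma is not reproduced here.

```python
import numpy as np

def kalman_channel_tracker(H_seq, y_seq, q=1e-3, r=1e-2):
    """Track a vectorized channel h[n] with a random-walk state model (sketch).

    State:       h[n] = h[n-1] + w[n],       w[n] ~ N(0, q*I)
    Observation: y[n] = H[n] @ h[n] + v[n],  v[n] ~ N(0, r*I)
    where H[n] holds the known (pilot) symbols.
    """
    d = H_seq[0].shape[1]
    h = np.zeros(d)                    # channel estimate
    P = np.eye(d)                      # error covariance
    estimates = []
    for H, y in zip(H_seq, y_seq):
        P = P + q * np.eye(d)          # predict (random walk: mean unchanged)
        S = H @ P @ H.T + r * np.eye(H.shape[0])   # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
        h = h + K @ (y - H @ h)        # measurement update
        P = (np.eye(d) - K @ H) @ P
        estimates.append(h.copy())
    return np.array(estimates)

# Illustrative use: a synthetic static 2x2 flat channel (4 taps, vectorized).
rng = np.random.default_rng(0)
H_seq = [rng.standard_normal((2, 4)) for _ in range(50)]
true_h = rng.standard_normal(4)
y_seq = [H @ true_h + 0.1 * rng.standard_normal(2) for H in H_seq]
est = kalman_channel_tracker(H_seq, y_seq)   # est[-1] approaches true_h
```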
Schemes for error resilient streaming of perceptually coded audio
J. Korhonen, Ye-Kui Wang
{"title":"Schemes for error resilient streaming of perceptually coded audio","authors":"J. Korhonen, Ye-Kui Wang","doi":"10.1109/ICASSP.2003.1200077","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1200077","url":null,"abstract":"This paper presents novel extensions to our earlier system for streaming perceptually coded audio over error prone channels such as Mobile IP. To improve error robustness while maintaining bandwidth efficiency, the new extensions combine the strength of an error resilient coding scheme in the sender, a prioritized packet transport scheme in the network and a compressed domain error concealment strategy in the terminal. Different concealment methods are used for each part of the coded audio data according to their perceptual importance and statistical characteristics. In our current implementation, we employed MPEG-2 Advanced Audio Coding (AAC) encoded bitstreams and an RTP/UDP-based test system for performance evaluation. Simulation results have shown that our improved streaming system is more robust against packet losses in comparison with conventional methods.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127967914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
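The abstract names three components without detail. As an illustration of the receiver side only, here is a generic repeat-and-attenuate concealment for lost frames, a common compressed-domain baseline rather than the paper's per-section strategy:

```python
import numpy as np

def conceal_lost_frames(frames, frame_len=1024, attenuation=0.7):
    """Compressed-domain concealment baseline (sketch): a lost frame (None)
    is replaced by the last correctly received spectral frame, attenuated
    further on each consecutive loss. The paper instead selects concealment
    per section of the coded data by perceptual importance, which is not
    modeled here.
    """
    last_good = np.zeros(frame_len)
    out = []
    for f in frames:
        if f is None:                          # packet lost: repeat and fade
            last_good = attenuation * last_good
        else:                                  # packet received: reset state
            last_good = np.asarray(f, dtype=float)
        out.append(last_good.copy())
    return out

# Example: the second of four spectral frames is lost in transit.
frames = [np.ones(1024), None, np.full(1024, 2.0), np.ones(1024)]
recovered = conceal_lost_frames(frames)   # frame 1 becomes 0.7 * frame 0
```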
A comparison of subspace analysis for face recognition
Jian Li, S. Zhou, C. Shekhar
{"title":"A comparison of subspace analysis for face recognition","authors":"Jian Li, S. Zhou, C. Shekhar","doi":"10.1109/ICASSP.2003.1199122","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1199122","url":null,"abstract":"We report the results of a comparative study on subspace analysis methods for face recognition. In particular, we have studied four different subspace representations and their 'kernelized' versions if available. They include both unsupervised methods such as principal component analysis (PCA) and independent component analysis (ICA), and supervised methods such as Fisher discriminant analysis (FDA) and probabilistic PCA (PPCA) used in a discriminative manner. The 'kernelized' versions of these methods provide subspaces of high-dimensional feature spaces induced by non-linear mappings. To test the effectiveness of these subspace representations, we experiment on two databases with three typical variations of face images, i.e, pose, illumination and facial expression changes. The comparison of these methods applied to different variations in face images offers a comprehensive view of all the subspace methods currently used in face recognition.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116087203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
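Of the four representations compared, plain PCA is the easiest to sketch. The eigenface-style subspace fit and nearest-neighbor match below is a generic illustration, not the paper's experimental setup:

```python
import numpy as np

def pca_subspace(X, n_components):
    """Fit a PCA ('eigenface') subspace to row-vectorized face images X and
    return (mean, basis); shown here in its plain, non-kernel form."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean, Vt[:n_components]

def project(x, mean, basis):
    """Project one vectorized face onto the subspace."""
    return basis @ (x - mean)

# Nearest-neighbor matching in the subspace on synthetic "faces".
rng = np.random.default_rng(1)
gallery = rng.standard_normal((40, 32 * 32))        # 40 vectorized images
mean, basis = pca_subspace(gallery, n_components=10)
probe = gallery[0] + 0.01 * rng.standard_normal(32 * 32)  # noisy copy of #0
codes = np.array([project(g, mean, basis) for g in gallery])
match = np.argmin(np.linalg.norm(codes - project(probe, mean, basis), axis=1))
print(match)  # expected: 0
```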
HMM-neural network monophone models for computer-based articulation training for the hearing impaired
M. Devarajan, Fansheng Meng, P. Hix, S. Zahorian
{"title":"HMM-neural network monophone models for computer-based articulation training for the hearing impaired","authors":"M. Devarajan, Fansheng Meng, P. Hix, S. Zahorian","doi":"10.1109/ICASSP.2003.1202373","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202373","url":null,"abstract":"A visual speech training aid for persons with hearing impairments has been developed using a Windows-based multimedia computer. Previous papers (Zahorian, S. et al., Int. Conf. on Spoken Language Processing, 2002; Zahorian and Nossair, Z.B., IEEE Trans. on Speech and Audio Processing, vol.7, no.4, p.414-25, 1999; Zimmer, A. et al., ICASSP, vol.6, p.3625-8, 1998; Zahorian and Jagharghi, A., J. Acoust. Soc. Amer., vol.94, no.4, p.1966-82, 1993) have describe the signal processing steps and display options for giving real-time feedback about the quality of pronunciation for 10 steady-state American English monopthong vowels (/aa/, /iy/, /uw/, /ae/, /er/, /ih/, /eh/, /ao/, /ah/, and /uh/). This vowel training aid is thus referred to as a vowel articulation training aid (VATA). We now describe methods to develop a monophone-based hidden Markov model/neural network recognizer such that real time visual feedback can be given about the quality of pronunciation of short words and phrases. Experimental results are reported which indicate a high degree of accuracy for labeling and segmenting the CVC (consonant-vowel-consonant) database developed for \"training\" the display.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128067532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
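The abstract does not detail the hybrid recognizer. As a sketch of the generic HMM/neural-network idea (a network emits per-frame monophone posteriors and Viterbi decoding yields the labeling and segmentation), with synthetic posteriors and transitions standing in for trained models:

```python
import numpy as np

def viterbi(log_post, log_trans):
    """Best state path through per-frame log-posteriors (T x S) under a log
    transition matrix (S x S); in hybrid HMM/NN systems, scaled network
    posteriors replace GMM emission likelihoods."""
    T, S = log_post.shape
    delta = np.full((T, S), -np.inf)       # best score ending in each state
    psi = np.zeros((T, S), dtype=int)      # backpointers
    delta[0] = log_post[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # all predecessor scores
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_post[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):          # backtrack
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# Synthetic example: 3 monophone states with self-loop-heavy transitions.
rng = np.random.default_rng(2)
log_post = np.log(rng.dirichlet(np.ones(3), size=20))   # NN output stand-in
trans = np.full((3, 3), 0.1) + 0.7 * np.eye(3)          # rows sum to 1
states = viterbi(log_post, np.log(trans))               # frame labeling
```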
Watermarking of 3D models using principal component analysis
Andreas Kalivas, A. Tefas, I. Pitas
{"title":"Watermarking of 3D models using principal component analysis","authors":"Andreas Kalivas, A. Tefas, I. Pitas","doi":"10.1109/ICASSP.2003.1200061","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1200061","url":null,"abstract":"A novel method for 3D model watermarking, robust to geometric distortions such as rotation, translation and scaling, is proposed. A ternary watermark is embedded in the vertex topology of a 3D model. A transformation of the model to an invariant space is proposed prior to watermark embedding. Simulation results indicate the ability of the proposed method to deal with the aforementioned attacks giving very good results.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131899850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 59
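The paper does not specify the invariant transformation beyond its name, but the title suggests a PCA-based canonicalization. A common version (centroid removal, principal-axis rotation, scale normalization) would look like the sketch below; note the residual per-axis sign ambiguity of eigenvectors:

```python
import numpy as np

def to_invariant_space(vertices):
    """Map a 3D vertex set to a pose/scale-normalized frame (sketch):
    center at the centroid, rotate onto the principal axes, and scale to
    unit RMS radius. A watermark embedded in this frame is insensitive to
    rotation, translation and uniform scaling."""
    V = vertices - vertices.mean(axis=0)        # remove translation
    cov = V.T @ V / len(V)
    eigvals, eigvecs = np.linalg.eigh(cov)      # principal axes (ascending)
    V = V @ eigvecs[:, ::-1]                    # largest-variance axis first
    scale = np.sqrt((V ** 2).sum(axis=1).mean())
    return V / scale                            # remove uniform scale

# A rotated, translated, scaled copy maps to the same frame up to axis signs.
rng = np.random.default_rng(3)
model = rng.standard_normal((500, 3)) * [3.0, 2.0, 1.0]
theta = 0.4
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
attacked = 2.5 * model @ R.T + np.array([10.0, -4.0, 7.0])
assert np.allclose(np.abs(to_invariant_space(model)),
                   np.abs(to_invariant_space(attacked)), atol=1e-6)
```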
Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts
Arun Kumar, Ashish Verma
{"title":"Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts","authors":"Arun Kumar, Ashish Verma","doi":"10.1109/ICASSP.2003.1198882","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1198882","url":null,"abstract":"Voice conversion techniques attempt to modify the speech signal so that it is perceived as if spoken by another speaker, different from the original speaker. In this paper, we present a novel approach to perform voice conversion. Our approach uses acoustic models based on units of speech, like phones and diphones, for voice conversion. These models can be computed and used independently for a given speaker without being concerned about the source or target speaker. It avoids the use of a parallel speech corpus in the voices of source and target speakers. It is shown that by using the proposed approach, voice fonts can be created and stored which represent individual characteristics of a particular speaker, to be used for customization of synthetic speech. We also show through objective and subjective tests, that voice conversion quality is comparable to other approaches that require a parallel speech corpus.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124944972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
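As a toy illustration of the "independent per-speaker models" idea (each speaker's phone models trained alone, with no parallel corpus), the sketch below shifts source frames toward the target speaker's model of the same phone; real conversion transforms spectra far more carefully and resynthesizes speech, so all names and the 13-dim cepstral features here are hypothetical:

```python
import numpy as np

def build_phone_models(features, labels):
    """Per-speaker 'voice font' sketch: one mean feature vector per phone,
    estimated from that speaker's data alone (no parallel corpus)."""
    return {p: features[labels == p].mean(axis=0) for p in np.unique(labels)}

def convert_frames(source_feats, source_labels, target_font):
    """Toy conversion: shift each source frame so its phone's mean matches
    the target speaker's stored mean for the same phone."""
    source_font = build_phone_models(source_feats, source_labels)
    out = source_feats.astype(float).copy()
    for p, target_mean in target_font.items():
        mask = source_labels == p
        if p in source_font and mask.any():
            out[mask] += target_mean - source_font[p]   # align phone means
    return out

# Hypothetical cepstral frames for two speakers over phones 'aa' and 'iy'.
rng = np.random.default_rng(7)
labels = np.array(['aa'] * 50 + ['iy'] * 50)
src = rng.standard_normal((100, 13))
tgt = rng.standard_normal((100, 13)) + 2.0       # a "different" speaker
converted = convert_frames(src, labels, build_phone_models(tgt, labels))
```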
A probabilistic approach for blind source separation of underdetermined convolutive mixtures
J. M. Peterson, S. Kadambe
{"title":"A probabilistic approach for blind source separation of underdetermined convolutive mixtures","authors":"J. M. Peterson, S. Kadambe","doi":"10.1109/ICASSP.2003.1201748","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1201748","url":null,"abstract":"There are very few techniques that can separate signals from the convolutive mixture in the underdetermined case. We have developed a method that uses overcomplete expansion of the signal created with a time-frequency transform and that also uses the property of sparseness and a Laplacian source density model to obtain the source signals from the instantaneously mixed signals in the underdetermined case. This technique has been extended here to separate signals (a) in the case of underdetermined convolutive mixtures, and (b) in the general case of more than 2 mixtures. Here, we also propose a geometric constrained based search approach to significantly reduce the computational time of our original \"dual update\" algorithm. Several examples are provided. The results of signal separation from the convolutive mixtures indicate that an average signal to noise ratio improvement of 5.3 dB can be obtained.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122169947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
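The "dual update" algorithm itself is not given in the abstract. The sketch below shows the simpler sparsity idea underlying such methods, for the instantaneous two-mixture case: sparse sources rarely overlap in the time-frequency plane, so bins can be clustered by mixing angle and masked. The paper's Laplacian prior and convolutive extension are not reproduced.

```python
import numpy as np

def sparse_separate(x1, x2, n_sources=3, nfft=256):
    """Underdetermined separation sketch (2 mixtures, instantaneous case):
    assign each STFT bin to one source by the amplitude ratio of the two
    mixtures (mixing angle), then apply binary masks and invert. The crude
    non-overlapping framing is for brevity only."""
    def stft(x):
        frames = x[:len(x) // nfft * nfft].reshape(-1, nfft)
        return np.fft.rfft(frames * np.hanning(nfft), axis=1)
    X1, X2 = stft(x1), stft(x2)
    angle = np.arctan2(np.abs(X2), np.abs(X1))        # mixing angle per bin
    # crude 1-D k-means on the angles: one cluster per source direction
    centers = np.linspace(angle.min(), angle.max(), n_sources)
    for _ in range(20):
        labels = np.argmin(np.abs(angle[..., None] - centers), axis=-1)
        for k in range(n_sources):
            if np.any(labels == k):
                centers[k] = angle[labels == k].mean()
    sources = []
    for k in range(n_sources):
        Sk = np.where(labels == k, X1 + X2, 0.0)      # masked mixture sum
        sources.append(np.fft.irfft(Sk, n=nfft, axis=1).reshape(-1))
    return sources

# Toy use: three mostly disjoint sinusoid bursts, two mixtures.
fs, n = 8000, 8192
t = np.arange(n) / fs
s = np.zeros((3, n))
for k, f in enumerate((440.0, 880.0, 1320.0)):
    seg = slice(k * n // 3, (k + 1) * n // 3)         # disjoint activity
    s[k, seg] = np.sin(2 * np.pi * f * t[seg])
A = np.array([[1.0, 0.7, 0.2],
              [0.2, 0.7, 1.0]])                       # 2 x 3 mixing matrix
x1, x2 = A @ s
est = sparse_separate(x1, x2)
```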
Oscillatory gestures and discourse
Francis K. H. Quek, Yingen Xiong
{"title":"Oscillatory gestures and discourse","authors":"Francis K. H. Quek, Yingen Xiong","doi":"10.1109/ICASSP.2003.1200090","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1200090","url":null,"abstract":"Gesture and speech are part of a single human language system. They are co-expressive and complementary channels in the act of speaking. While speech carries the major load of symbolic presentation, gesture provides the imagistic content. Proceeding from the established contemporality of gesture and speech, we discuss our work on oscillatory gestures and speech. We present our wavelet-based approach in gestural oscillation extraction as geodesic ridges in frequency-time space. We motivate the potential of such computational cross-modal language analysis by performing a micro analysis of a video dataset in which a subject describes her living space. We demonstrate the ability of our algorithm to extract gestural oscillations and show how oscillatory gestures reveal portions of the discourse structure.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123924764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
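As an illustration of gestural-oscillation extraction, the sketch below computes a Morlet continuous wavelet transform of a synthetic hand trajectory and follows the dominant frequency per frame; the paper's geodesic-ridge criterion is more robust than this per-column argmax, and the 30 fps tracking rate is an assumption.

```python
import numpy as np

def morlet_cwt(signal, freqs, fs, w0=6.0):
    """Continuous wavelet transform with a complex Morlet wavelet (sketch)."""
    out = np.empty((len(freqs), len(signal)), dtype=complex)
    for i, f in enumerate(freqs):
        s = w0 / (2 * np.pi * f)                 # scale giving center freq f
        tau = np.arange(-4 * s, 4 * s, 1 / fs)   # wavelet support
        wavelet = np.exp(1j * w0 * tau / s) * np.exp(-((tau / s) ** 2) / 2)
        wavelet /= np.sqrt(s)                    # scale normalization
        out[i] = np.convolve(signal, np.conj(wavelet[::-1]), mode="same")
    return out

def ridge(cwt_mag):
    """Per-frame dominant frequency bin; a stand-in for the paper's
    geodesic-ridge extraction in frequency-time space."""
    return cwt_mag.argmax(axis=0)

# A hand trajectory oscillating near 2 Hz should give a ridge near 2 Hz.
fs = 30.0                                        # video frame rate (assumed)
rng = np.random.default_rng(4)
t = np.arange(0, 10, 1 / fs)                     # 10 s of tracked motion
hand_y = np.sin(2 * np.pi * 2.0 * t) + 0.2 * rng.standard_normal(len(t))
freqs = np.linspace(1.0, 5.0, 40)
dominant = freqs[ridge(np.abs(morlet_cwt(hand_y, freqs, fs)))]  # ~2 Hz
```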
Unconstrained motion compensated temporal filtering (UMCTF) framework for wavelet video coding
M. Schaar, D. Turaga
{"title":"Unconstrained motion compensated temporal filtering (UMCTF) framework for wavelet video coding","authors":"M. Schaar, D. Turaga","doi":"10.1109/ICASSP.2003.1199112","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1199112","url":null,"abstract":"The paper presents a new framework for adaptive temporal filtering in wavelet interframe codecs, called unconstrained motion compensated temporal filtering (UMCTF). This framework allows flexible and efficient temporal filtering by combining the best features of motion compensation, used in predictive coding, with the advantages of interframe scalable wavelet video coding schemes. UMCTF provides higher coding efficiency, improved visual quality and flexibility of temporal and spatial scalability, higher coding efficiency and lower decoding delay than conventional MCTF schemes. Furthermore, UMCTF can also be employed in alternative open-loop scalable coding frameworks using DCT for the texture coding.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"257 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123965254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
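A minimal sketch of one temporal decomposition level with a Haar lifting kernel, motion compensation omitted for brevity: real (U)MCTF warps frames along motion vectors before filtering and, in the unconstrained form, admits adaptive and non-dyadic filter choices.

```python
import numpy as np

def mctf_haar(frames):
    """One temporal decomposition level, Haar lifting, no motion (sketch):
    the high band H holds prediction residuals between frame pairs and the
    low band L holds their averages."""
    A, B = frames[0::2], frames[1::2]      # even / odd frames
    H = B - A                              # predict step (residual)
    L = A + H / 2.0                        # update step: L = (A + B) / 2
    return L, H

def imctf_haar(L, H):
    """Inverse lifting: exactly reverses the predict/update steps, giving
    perfect reconstruction."""
    A = L - H / 2.0
    B = H + A
    frames = np.empty((len(A) * 2,) + A.shape[1:], dtype=A.dtype)
    frames[0::2], frames[1::2] = A, B
    return frames

# Perfect reconstruction on a toy "video" of 8 random 16x16 frames.
rng = np.random.default_rng(5)
video = rng.standard_normal((8, 16, 16))
L, H = mctf_haar(video)
assert np.allclose(imctf_haar(L, H), video)
```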