{"title":"Blind code timing and carrier offset estimation for DS-CDMA systems","authors":"K. Amleh, Hongbin Li","doi":"10.1109/ICASSP.2003.1202714","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202714","url":null,"abstract":"We consider the problem of joint carrier offset and code timing estimation for CDMA (code division multiple access) systems. In contrast to most existing schemes which require a multi-dimensional search over the parameter space, we propose a blind estimator that solves the joint estimation problem algebraically. By exploiting the noise subspace of the covariance matrix of the received data, the multiuser estimation is decoupled into parallel estimations of individual users, which makes computations efficient. The proposed estimator is non-iterative and near-far resistant. It can deal with frequency-selective and time-varying channels. The performance of the proposed scheme is illustrated by some computer simulations.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124119069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time delay estimation and signal reconstruction using multi-rate measurements","authors":"O. Jahromi, P. Aarabi","doi":"10.1109/ICASSP.2003.1201631","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1201631","url":null,"abstract":"This paper considers the problem of fusing two low-rate sensors (e.g., microphones) for reconstructing one high-resolution signal when time delay of arrival (TDOA) is present as well. We show that under certain conditions the phase of the cross-spectrum-density of low-rate measurements becomes independent of the signal in the high-rate front end of the system. We then utilize this fact to demonstrate that it is possible to extend a class of TDOA estimation techniques known as the generalized cross correlation technique to linear-phase multi-rate sensor systems. Finally, we illustrate how the combination of the theory of linear-phase multirate filter banks and TDOA estimation can result in a practical, multi-sensor signal reconstruction system.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126936321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Partitioned vector quantization: application to lossless compression of hyperspectral images","authors":"G. Motta, F. Rizzo, J. Storer","doi":"10.1109/ICASSP.2003.1199152","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1199152","url":null,"abstract":"A novel design for a vector quantizer that uses multiple codebooks of variable dimensionality is proposed. High dimensional source vectors are first partitioned into two or more subvectors of (possibly) different length and then, each subvector is individually encoded with an appropriate codebook. Further redundancy is exploited by conditional entropy coding of the subvectors indices. This scheme allows practical quantization of high dimensional vectors in which each vector component is allowed to have different alphabet and distribution. This is typically the case of the pixels representing a hyperspectral image. We present experimental results in the lossless and near-lossless encoding of such images. The method can be easily adapted to lossy coding.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128416745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-power hybrid structure of digital matched filters for direct sequence spread spectrum systems","authors":"Sung-Won Lee, I. Park","doi":"10.1109/ICASSP.2003.1202459","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202459","url":null,"abstract":"The paper presents a low-power structure of digital matched filters (DMFs), which is proposed for direct sequence spread spectrum systems. Traditionally, low-power approaches for DMFs are based on either the transposed-form structure or the direct-form one. A new hybrid structure that employs the direct-form structure for local addition and the transposed-form structure for global addition is used to take advantage of both structures. For a 128-tap DMF, the proposed DMF that processes 32 addends a cycle consumes 46% less power at the expense of 6% area overhead as compared to the state-of-the-art low-power DMF (Liou, M. and Chiueh, T., IEEE J. Solid-State Circuits, vol.31, p.933-43, 2001).","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132155374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of EEG basic rhythm feature by using band relative intensity ratio (BRIR)","authors":"Zhong Ji, S. Qi","doi":"10.1109/ICASSP.2003.1201710","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1201710","url":null,"abstract":"In the clinical analysis and processing for EEG, because of the difference of ages and pathology, it is possible for abnormal waves to appear, related with pathology. Also, the restraint of normal rhythms could be abnormal. But at present doctors estimate if a certain rhythm is restrained only by eye or by some simple analysis methods in clinical EEG detection, which will inevitably lead to some errors and are not observable. By \"the virtual EEG record and analysis instrument\" introduced in this paper, all kinds of characteristic waveforms (e.g. epileptic wave and spikes wave etc.) can be detected and analyzed in time-frequency domain. From the view of clinical application, the concept of band relative intensity ratio (BRIR) is introduced with time-frequency domain analysis, by the use of which we can obtain the relative intensity of all basic rhythms in a certain time period, and this is believed to provide a good assisting analysis method.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"251 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132285116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware oriented rate control algorithm and implementation for realtime video coding","authors":"Hung-Chi Fang, Tu-Chih Wang, Yu-Wei Chang, Liang-Gee Chen","doi":"10.1109/ICASSP.2003.1202410","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202410","url":null,"abstract":"In this paper, a novel rate control algorithm suitable for real-time video encoding is proposed. The proposed algorithm uses mean absolute error (MAE) results of motion estimation (ME) to achieve bit-rate control. Neither pre-analysis nor multi-pass encoding is required in our algorithm, which makes real-time hardware implementation possible. A new hardware oriented scene change detection method is also included in this rate control framework to achieve better video quality. Experiments show our rate control algorithm behaves well in all situations. The hardware architecture for this algorithm is also described. Implementation shows our proposed algorithm can be efficiently integrated into a low cost, high efficiency video encoder.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134089144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Audiovisual-based adaptive speaker identification","authors":"Y. Li, Shrikanth S. Narayanan, C.-C. Jay Kuo","doi":"10.1109/ICASSP.2003.1200095","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1200095","url":null,"abstract":"An adaptive speaker identification system is presented in this paper, which aims to recognize speakers in feature films by exploiting both audio and visual cues. Specifically, the audio source is first analyzed to identify speakers using a likelihood-based approach. Meanwhile, the visual source is parsed to recognize talking faces using face detection/recognition and mouth tracking techniques. These two information sources are then integrated under a probabilistic framework for improved system performance. Moreover, to account for speakers' voice variations along time, we update their acoustic models on the fly by adapting to their newly contributed speech data. An average of 80% identification accuracy has been achieved on two test movies. This shows a promising future for the proposed audiovisual-based adaptive speaker identification approach.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121304027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking multiple maneuvering point targets using multiple filter bank in infrared image sequence","authors":"M. Zaveri, U. Desai, S. Merchant","doi":"10.1109/ICASSP.2003.1202386","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202386","url":null,"abstract":"Performance of any tracking algorithm depends upon the model selected to capture the target dynamics. In real world applications, no a priori knowledge about the target motion is available. Moreover, it could be a maneuvering target. The proposed method is able to track maneuvering or nonmaneuvering multiple point targets with large motion (/spl plusmn/20 pixels) using multiple filter bank in an IR image sequence in the presence of clutter and occlusion due to clouds. The use of multiple filters is not new, but the novel idea here is that it uses single-step decision logic to switch over between filters. Our approach does not use any a priori knowledge about maneuver parameters, nor does it exploit a parameterized nonlinear model for the target trajectories. This is in contrast to: (i) interacting multiple model (IMM) filtering which required the maneuver parameters, and (ii) extended Kalman filter (EKF) or unscented Kalman filter (UKF), both of which require a parameterized model for the trajectories. We compared our approach for target tracking with IMM filtering using EKF and UKF for nonlinear trajectory models. UKF uses the nonlinearity of the target model, where as a first order linearization is used in case of the EKF. RMS for the predicted position error (RMS-PPE) obtained using our proposed methodology is significantly less in case of highly maneuvering target.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121929389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust digit recognition using phase-dependent time-frequency masking","authors":"Guangji Shi, P. Aarabi","doi":"10.1109/ICASSP.2003.1198873","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1198873","url":null,"abstract":"A technique using the time-frequency phase information of two microphones is proposed to estimate an ideal time-frequency mask using time-delay-of-arrival (TDOA) of the signal of interest. At a signal-to-noise ratio (SNR) of 0 dB, the proposed technique using two microphones achieves a digit recognition rate (average over 5 speakers, each speaking 20-30 digits) of 71%. In contrast, delay-and-sum beamforming only achieves a 40% recognition rate with two microphones and 60% with four microphones. Superdirective beamforming achieves a 44% recognition rate with two microphones and 65% with four microphones.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116616732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multidimensional humming transcription using a statistical approach for query by humming systems","authors":"Hsuan-Huei Shih, Shrikanth S. Narayanan, C.-C. Jay Kuo","doi":"10.1109/ICASSP.2003.1200026","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1200026","url":null,"abstract":"A new statistical pattern recognition approach applied to human humming transcription is proposed. A musical note has two important attributes, i.e. pitch and duration. The proposed algorithm generates multidimensional humming transcriptions, which contain both pitch and duration information. Query by humming provides a natural means for content-based retrieval from music databases, and this research provides a robust frontend for such an application. The segment of a note in the humming waveform is modeled by a hidden Markov model (HMM), while the pitch of the note is modeled by a pitch model using a Gaussian mixture model. Preliminary real-time recognition experiments are carried out with models trained by data obtained from eight human subjects, and an overall correct recognition rate of around 80% is demonstrated.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123243557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}