{"title":"Fixed point error analysis of CORDIC processor based on the variance propagation","authors":"S. Park, N. Cho","doi":"10.1109/ICME.2003.1221746","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221746","url":null,"abstract":"The effects of angle approximation and rounding in the CORDIC processor have been intensively studied for the determination of design parameters. However, the conventional analyses provide only the error bound which results in large discrepancy between the analysis and the actual implementation. Moreover, some of the signal processing architectures require the specification in terms of the mean squared error (MSE) as in the design specification of FFT processor for OFDM. This paper proposes a fixed point MSE analysis based on the variance propagation for more accurate error expression of CORDIC processor. It is shown that the proposed analysis can also be applied to the modified CORDIC algorithms. As an example of application, an FFT processor for OFDM using the CORDIC processor is presented. The results show close match between the analysis and simulation.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132379833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Capacity analysis for parallel and sequential MIMO equalizers","authors":"Xinying Zhang, S. Kung","doi":"10.1109/ICME.2003.1221689","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221689","url":null,"abstract":"It is well known that linear MMSE can outperform its zero-forcing counterpart. In combination with a successive interference canceller, MMSE can fully exploit the capacity of MIMO (multiple-input-multiple-output) channels [A.J. Viterbi, 1986, M.K. Varanasi, T. Guess, 1997]. In practice, however, such an advantage is compromised due to its implementation complexity and the requirement of accurate SNR estimate. Thus other equalizers such as zero-forcing may present an attractive alternative as long as the performance gap is tolerable. This motivates a need to quantify the tradeoff between MMSE and zero-forcing in both parallel and sequential structures. In this paper, the capacity performance of different equalization schemes is investigated, with closed-form formulas provided in terms of two key measures: capacity gaps and ratios. We also conclude that the capacity gain via structural choice (between parallel and sequential) far outweighs that via filter choice (between zero-forcing and MMSE). Indeed, the latter is found to be almost negligible for most practical SNR regions. It is also shown that the sequential zero-forcing equalizers can asymptotically reach the channel capacity when SNR approaches infinity, irrespective of the detection order. Although this paper is focused on the flat-fading channels, the result is directly extendable to the ISI case by slicing the frequency band into infinitesimal stripes, each of which can be treated as flat.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132109013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic speaker recognition using dynamic Bayesian network","authors":"Lifeng Sang, Zhaohui Wu, Yingchun Yang, Wanfeng Zhang","doi":"10.1109/ICME.2003.1221386","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221386","url":null,"abstract":"This paper presents a novel approach to automatic speaker recognition using dynamic Bayesian network (DBN). DBNs have a precise and well-understood probabilistic semantics, and they have the ability to incorporate prior knowledge, to represent arbitrary non-linearities, and to handle hidden variables and missing data in a principled way with high extensibility. Experimental evaluation over YOHO corpus shows promising results compared to other classical methods.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115401127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning cross-modal appearance models with application to tracking","authors":"John W. Fisher III, Trevor Darrell","doi":"10.1109/ICME.2003.1221541","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221541","url":null,"abstract":"Objects of interest are rarely silent or invisible. Analysis of multi-modal signal generation from a single object represents a rich and challenging area for smart sensor arrays. We consider the problem of simultaneously learning an audio and visual appearance model of a moving subject. We present a method which successfully learns such a model without benefit of hand initialization using only the associated audio signal to \"decide\" which object to model and track. We are interested in particular in modeling joint audio and video variation, such as produced by a speaking face. We present an algorithm and experimental results of a human speaker moving in a scene.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124469333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incorporating real-valued multiple instance learning into relevance feedback for image retrieval","authors":"Xin Huang, Shu‐Ching Chen, M. Shyu","doi":"10.1109/ICME.2003.1220919","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220919","url":null,"abstract":"This paper presents a content-based image retrieval (CBIR) system that incorporates real-valued multiple instance learning (MIL) into the user relevance feedback (RF) to learn the user's subjective visual concepts, especially where the user's most interested region is and how to map the local feature vector of that region to the high-level concept pattern of the user. RF provides a way to obtain the subjectivity of the user's high-level visual concepts, and MIL enables the automatic learning of the user's high-level concepts. The user interacts with the CBIR system by relevance feedback in a way that the extent to which the image samples retrieved by the system are relevant to the user's intention is labeled. The system in turn applies the MIL method to find the user's most interested image region from the feedback. A multilayer neural network that is trained progressively through the feedback and learning procedure is used to map the low-level image features to the high-level concepts.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124742399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real time eye tracking for human computer interfaces","authors":"A. Subramanya, Raghunandan S. Kumaran, J. Gowdy","doi":"10.1109/ICME.2003.1221372","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221372","url":null,"abstract":"In recent years considerable interest has developed in real time eye tracking for various applications. An approach that has received a lot of attention is the use of infrared technology for purposes of eye tracking. In this paper, we propose a technique that does not rely on the use of infrared devices for eye tracking. Instead, our eye tracker makes use of a binary classifier with a dynamic training strategy and an unsupervised clustering stage in order to efficiently track the pupil (eyeball) in real time. The dynamic training strategy makes the algorithm subject (speaker) and lighting condition invariant. Our algorithm does not make any assumption regarding the position of the speaker's face in the field of view of the camera, nor does it restrict the 'natural' motion of the speaker in the field of view of the camera. Experimental results from a real time implementation show that this algorithm is robust and able to detect the pupils under various illumination conditions.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125817681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive, acoustic noise suppression for speech enhancement","authors":"Phil Spencer Whitehead, David V. Anderson, M. Clements","doi":"10.1109/ICME.2003.1220980","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220980","url":null,"abstract":"Removal of ambient noise from a single-channel audio signal is becoming an increasingly important problem due to the proliferation of portable communication devices. Furthermore, in applications such as wireless telephony and phonetic data mining, it is desired that noise suppression be robust to changing noise conditions and that processing take place in real time or faster. This paper proposes an adaptive noise suppression system that mitigates or eliminates processing artifacts common to Wiener filtering without decreasing speech recognition performance. Results of one implementation of such a structure demonstrate significant improvements in both perceptual quality and speech recognition performance under noisy conditions.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125927279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotion representation for online gaming","authors":"A. Raouzaiou, K. Karpouzis, S. Kollias","doi":"10.1109/ICME.2003.1220943","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220943","url":null,"abstract":"The ability to simulate lifelike interactive characters has many applications in the gaming industry. Human faces may act as visual interfaces that help users feel at home when interacting with a computer because they are accepted as the most expressive means for communicating and recognizing emotions. Thus, a lifelike human face can enhance interactive applications by providing straightforward feedback to and from the users and stimulating emotional responses from them. Thus, the gaming and entertainment industries can benefit from employing believable, expressive characters since such features significantly enhance the atmosphere of a virtual world and communicate messages far more vividly than any textual or speech information. In this paper, we present an abstract means of description of facial expressions, by utilizing concepts included in the MPEG-4 standard. Furthermore, we exploit these concepts to synthesize a wide variety of expressions using a reduced representation, suitable for networked and lightweight applications.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115559635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid video downloading/streaming over peer-to-peer networks","authors":"Yufeng Shan, S. Kalyanaraman","doi":"10.1109/ICME.2003.1221704","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221704","url":null,"abstract":"Peer-to-peer based multimedia delivery is becoming increasingly important in today's networks. Using a peer-to-peer network to assist video streaming is a topic of considerable interest. In this paper, we propose a novel hybrid video downloading/streaming scheme (HDS) that efficiently integrates traditional client/server based video streaming and peer-to-peer based media distribution. Furthermore, we propose a receiver-driven algorithm to coordinate the downloading and streaming modes and control the state transitions between these modes. We have performed real-world experiments and simulations to validate our concept. These results show that our proposed scheme greatly increases the availability of video content on the receiver side and simultaneously reduces the server load significantly.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116005320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"State-space RLS","authors":"M. Malik","doi":"10.1109/ICME.2003.1221048","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221048","url":null,"abstract":"The Kalman filter is a linear optimal estimator for random signals. We develop state-space RLS, which is the counterpart of the Kalman filter for deterministic signals, i.e., there is no process noise but only observation noise. State-space RLS inherits its optimality properties from the standard least squares. It gives excellent tracking performance as compared to existing forms of RLS. A large class of signals can be modeled as outputs of neutrally stable unforced linear systems. State-space RLS is particularly well suited to estimate such signals. The paper commences with batch processing the observations, which is later extended to recursive algorithms. The comparison and equivalence of the Kalman filter and state-space RLS become evident during the development of the theory. State-space RLS is expected to become an important tool in estimation theory and adaptive filtering.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116412526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}