{"title":"Maximum likelihood method for blind identification of multiple autoregressive channels","authors":"Zheng Fang, Y. Hua","doi":"10.1109/ICME.2003.1221685","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221685","url":null,"abstract":"We present a two-step maximum likelihood (TSML) algorithm for blind identification of single-input-multiple-output (SIMO) channels modeled as autoregressive (AR) system. The AR-TSML algorithm provides a new and useful alternative to a previously developed TSML algorithm for moving-average (MA) system. The AR-TSML algorithm is shown to be more robust than the MA-TSML algorithm if the channel impulse responses have long tails.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125760636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A delay-efficient rerouting scheme for VoIP traffic","authors":"N. Kamat, Ju Wang, Jonathan C. L. Liu","doi":"10.1109/ICME.2003.1221294","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221294","url":null,"abstract":"Routing the packet flows through the network is a very important function and if done efficiently, it can contribute appreciably towards ensuring the desired levels of voice over IP (VoIP) quality. Current limitations do not attempt to reconfigure the path of the flows in the case of loss of quality. We propose a dynamic rerouting scheme called BPAR scheme for voice traffic, in which the paths of packet voice flows are dynamically reconfigured, based on network probing and rerouting. We performed a series of emulations to evaluate the performance of the BPAR scheme. For GSM-compressed traffic, the dynamic reconfiguration proposed in this study shows an improvement of 12% reduction in average delay, and also a better connection acceptance (16% high) than other static methods under conditions of network congestion.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"41 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125829304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Super resolution recovery for multi-camera surveillance imaging","authors":"Gulcin Caner, A. Tekalp, W. Heinzelman","doi":"10.1109/ICME.2003.1220866","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220866","url":null,"abstract":"In many surveillance video applications, it is of interest to recognize an object or a person, which occupies a small portion of a low-resolution, noisy video. This paper addresses the problem of super-resolution recovery of a region of interest from more than one low-resolution view of a scene recorded by multiple cameras. The multiple camera scenario alleviates the difficulty in registration of multiple frames of video that contain non-rigid or multiple object motion in the single camera case. With proper temporal registration of multiple videos, arbitrary scene motion can be handled. The success of super-resolution recovery from multiple views in real applications vitally depends on two factors: i) the accuracy of multiple view registration results, and ii) the accuracy of the camera and data acquisition model. We propose a system, which consists of a method for sub-pixel accurate spatio-temporal alignment of multiple video sequences for view registration and the projections onto convex sets method for super-resolution recovery. Experiments were implemented using two commercial analog video cameras, which do not perform on-board compression. Experimental results show that the super resolution recovery of dynamic scenes can be achieved as long as the multiple views of the scene can be registered with sub-pixel accuracy.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125905950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new cluster-based distributed video recorder server","authors":"Xiaofei Liao, Hai Jin","doi":"10.1109/ICME.2003.1221295","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221295","url":null,"abstract":"In this paper, we propose and implement a new scalable video server with distributed recording services using intelligent network attached storage based on cluster architectures. For this recording server, we propose a new distributed recording protocol. The recorder can also provide video-on-demand streaming services. With this facility, clients can access media data being recorded randomly, which are distributed on all selected storage nodes of the cluster. Compared with other recording servers, the system has better load balance performance and clients have better interactivities.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124889730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From multi-sensor surveillance towards smart interactive spaces","authors":"M. Gandetto, L. Marchesotti, S. Sciutto, D. Negroni, C. Regazzoni","doi":"10.1109/ICME.2003.1220999","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220999","url":null,"abstract":"This paper proposes a novel architecture for multisensor data fusion in the context of ambient intelligence (Ami). The proposed system integrates a heterogeneous network of sensors with CCD cameras and computational units working together in a LAN. Activities of humans interacting in the monitored area are detected and classified by combining sensors data output with a neural method.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"230 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129684661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Highlight sound effects detection in audio stream","authors":"Rui Cai, Lie Lu, HongJiang Zhang, Lianhong Cai","doi":"10.1109/ICME.2003.1221242","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221242","url":null,"abstract":"This paper addresses the problem of highlight sound effects detection in audio stream, which is very useful in fields of video summarization and highlight extraction. Unlike researches on audio segmentation and classification, in this domain, it just locates those highlight sound effects in audio stream. An extensible framework is proposed and in current system three sound effects are considered: laughter, applause and cheer, which are tied up with highlight events in entertainments, sports, meetings and home videos. HMMs are used to model these sound effects and a log-likelihood scores based method is used to make final decision. A sound effect attention model is also proposed to extend general audio attention model for highlight extraction and video summarization. Evaluations on a 2-hours audio database showed very encouraging results.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129856444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-complexity video compression for wireless sensor networks","authors":"E. Magli, Massimo Mancin, L. Merello","doi":"10.1109/ICME.2003.1221379","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221379","url":null,"abstract":"We study the problem of compression of videosurveillance sequences collected by a wireless sensor network. In particular, we propose a low-complexity coding framework based on change detection and JPEG-like compression of regions of interest, along with a suitable low-complexity change detection algorithm. We show that on typical videosurveillance sequences the performance of the proposed compression algorithm is similar to that of MPEG-2, at a much less computational cost. Energy profiling results on a TMS320VC5204 board validate the coder design for the proposed application.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127205896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast search algorithm for background music signals based on the search for numerous small signal components","authors":"H. Nagano, K. Kashino, H. Murase","doi":"10.1109/ICME.2003.1220880","DOIUrl":"https://doi.org/10.1109/ICME.2003.1220880","url":null,"abstract":"This paper proposes a method for detecting and locating a known music signal in a long audio stream. Unlike existing methods, ours assumes that the music is used as background music (BGM) and overlapped by another sound such as speech and that the interfering sound is typically louder than the target music. The proposed method is based on time-series active search, which is a quick signal search method reported earlier. To realize the BGM search, however, a novel extension is introduced. That is, the music signal is firstly decomposed into a number of small time-frequency regions, and the search is carried out for each of those components. The results of the search are then integrated based on a voting scheme to find the target music locations. Experiments show that accurate search is possible when SNR is -5 dB and that the search completes in about 8 s for a 30-m stored signal.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127336848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced access to digital video through visually rich interfaces","authors":"Michael G. Christel, Chang Huang","doi":"10.1109/ICME.2003.1221238","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221238","url":null,"abstract":"An image-rich interface is presented, which emphasizes visual exploration of sets of images representing shots returned from a query of filter against a digital video corpus. This interface, a storyboard of keyframes for multiple video segments, maintains temporal layout, accommodates contextual cues and filtering, supports additional filtering through visual features, and provides a means of drilling down to synchronized points in the associated video. These features allow for effective information retrieval from a video collection, as evidenced by the success achieved in interactive query for the TREC 2002 Video Retrieval Track (TREC-V). This paper introduces TREC-V, discusses the design of the multi-segment storyboard interface, illustrates its use with respect to the TREC-V topics, and presents results and conclusions based on the TREC-V evaluation.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129964313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Resource management for scalably encoded information: the case of image transmission over wireless networks","authors":"V. Rodriguez","doi":"10.1109/ICME.2003.1221042","DOIUrl":"https://doi.org/10.1109/ICME.2003.1221042","url":null,"abstract":"Scalable encoded information, as in the JPEG 2000 standard, results in files, which can be truncated at an arbitrary point and decoded. This work introduces a tractable, yet flexible model appropriate for resource management involving scalably encoded information. At its core is a function yielding a measure of \"quality\" of the decoded information as a function of the number of bits chosen for decoding. It is assumed that all that is known about this function is that its graph yields an \"S-curve\". An energy-efficient policy for the transmission over a wireless network of scalably- encoded images is sought. Two variables are jointly optimized: transmission power, and the number of bits of each file to be decoded (\"coding rate\"). The single-user case is fully analyzed, and a closed-form solution given, which can be clearly identified, graphically. The analysis indicates that both variables can be \"decoupled\", and their optimal values found independently of each other.","PeriodicalId":118560,"journal":{"name":"2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129163244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}