{"title":"Layered Clustering for Solar Powered Wireless Visual Sensor Networks","authors":"Xiaoming Fan, W. Shaw, I. Lee","doi":"10.1109/ISM.2007.35","DOIUrl":"https://doi.org/10.1109/ISM.2007.35","url":null,"abstract":"Visual-based wireless sensor networks have been implemented in several different fields such as environment monitoring, military applications, and robotic applications. Due to the limitations of node specifications, bandwidth and energy become critical issues for sensor nodes. In this paper, we employ a solar cell recharging model and a layered clustering model to deal with the restricted energy consumption under the consideration of visual quality. The system lifetime can be prolonged by a rechargeable solar cell that can be recharged by a solar panel in the daytime. In addition, we analyze the simulation results of energy consumption and total transmitted packets by changing the aggregation rate and gate energy (GE). With the aggregation rate decreasing, the cluster head in the inner layer can support more visual nodes and reserve more bandwidth. A lower GE can reduce the packet loss during the system charging process. The analysis and experimental results obtained in this paper prove that, with the combination of layered clustering and solar recharging, the performance of a wireless visual sensor network can be enhanced under the consideration of the restricted node capacity and video distortion.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"771 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121181034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Moving Region Detection by Transportation Problem Solving","authors":"T. Yokoyama, S. Furukawa, Toshinori Watanabe","doi":"10.1109/ISM.2007.28","DOIUrl":"https://doi.org/10.1109/ISM.2007.28","url":null,"abstract":"In this paper, we propose a novel moving region detection method from the viewpoint of solving the transportation problem. This method extracts the relations between regions as a solution to the transportation problem between pixels belonging to adjacent frames. Moving regions are detected by utilizing the properties of these relations. This method does not require any models such as prior knowledge or particular assumptions about moving objects or backgrounds in a video. Since the method adaptively detects moving regions from input frame data, it can deal with the fluctuations of moving objects or backgrounds. We demonstrate the effectiveness of the proposed method through several experiments conducted using actual videos.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"230 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123161092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature-Based Full-Frame Image Stabilization","authors":"Chih-Yuan Chung, Homer H. Chen","doi":"10.1109/ISM.2007.44","DOIUrl":"https://doi.org/10.1109/ISM.2007.44","url":null,"abstract":"Digital image stabilization usually discards boundary pixels and outputs a smaller video. In this paper, we present a new digital image stabilization algorithm that preserves the frame size of output video by pixel filling. The proposed algorithm eliminates the accumulation error by directly estimating the global motions in a transformation chain with reference to a fixed frame. A feature matching method is adopted to save the computational cost of the global motion estimation and to handle large motions. The experimental results show that the proposed algorithm produces stabilized full-frame video sequences with better frame alignment.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126385358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatial-Temporal Error Detection Scheme for Video Transmission over Noisy Channels","authors":"Guan-Lin Wu, Shao-Yi Chien","doi":"10.1109/ISM.2007.40","DOIUrl":"https://doi.org/10.1109/ISM.2007.40","url":null,"abstract":"Error detection plays an important role in an error-robust video decoder. In this paper, a spatial-temporal error detection scheme for a video decoder is proposed. By considering inherently spatial and temporal similarities in video sequences, the visually corrected macroblocks in the decoded frames are detected by employing a set of error detection procedures, where one cross-boundary similarity index and one cross-frame similarity index are defined for spatial and temporal error detection, respectively. An adaptive threshold scheme is also proposed to make the proposed error detection method suitable for different video sequences. After being integrated with an H.264 decoder with error concealment techniques, the video quality improvement of 0.5-2.4 dB in PSNR is achieved. This method can also be integrated with other video codecs to improve the decoded video quality over noisy channels.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123553134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Magic Mirror","authors":"Jun-Ren Ding, Chien-Lin Huang, Ji-Kun Lin, J. Yang, Chung-Hsien Wu","doi":"10.1109/ISM.2007.11","DOIUrl":"https://doi.org/10.1109/ISM.2007.11","url":null,"abstract":"This investigation describes a novel design and implementation of an interactive multimedia mirror system, called \"Magic Mirror.\" The Magic Mirror can be easily implemented in existing personal computers or hand-held device with normal peripherals and regular reflective glass by integrating image/speech processing, Internet connectivity, and 3D and multimedia software. The integrated Magic Mirror, which includes speech recognition, speech synthesis, face detection/modified/recognition, 3D virtual genius, hidden LCD mirror, and camera, performs simple syndication to capture information about peripherals and network connections. The user can easily activate personal multimedia services using verbal commands. The Magic Mirror can function like a good friend who listens to the user's questions and automatically responds to these requests, providing relaxation and consolation. Moreover, the Magic Mirror can detect a user's feeling based on speech and image recognition features to select the appropriate music and speech to alter the user's mood.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121143084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Design of a Multi-party VoIP Conferencing System over the Internet","authors":"B. Sat, Zixia Huang, B. Wah","doi":"10.1109/ISM.2007.48","DOIUrl":"https://doi.org/10.1109/ISM.2007.48","url":null,"abstract":"In this paper, we present the design of a VoIP conferencing system that enables the voice communication of multiple users in the Internet. After studying the conversational dynamics in multi-party conferencing, we identify user-observable metrics that affect the perception of conversational quality and their trade-offs. Based on the dynamics and the behavior on delays, jitters, and losses of Internet traces collected in the PlanetLab, we design the transmission topology and schemes for loss concealments and play-out scheduling. Last, we compare the performance of our system and Skype (version 3.5.0.214) using repeatable experiments that simulate human participants and network conditions in a multi-party conferencing scenario.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115385045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motion Retrieval Based on Energy Morphing","authors":"G. Tam, Qingzheng Zheng, M. Corbyn, Rynson W. H. Lau","doi":"10.1109/ISM.2007.15","DOIUrl":"https://doi.org/10.1109/ISM.2007.15","url":null,"abstract":"Matching and retrieval of motion sequences has become an important research area in recent years, due to the increasing availability and popularity of motion capture data. The main challenge in matching two motion sequences is the diversity of the captured motions, including variable length, local shifting, and local and global scaling. Most existing methods employ Dynamic Time Warping (DTW) or Uniform Scaling to handle these problems. In this paper, we propose a novel content-based method for matching human motion capture data. We convert the matching problem of motion capture data into a transportation problem. To solve this problem efficiently, we employ Earth Mover's Distance (EMD) as the matching framework. To penalize any strayed matching, we provide a ground distance that works similarly to the Sakoe-Chiba band of DTW. Empirical results obtained are encouraging.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121395971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation and Evaluation of Late Data Choice for TCP in Linux","authors":"E. Birkedal, C. Griwodz, P. Halvorsen","doi":"10.1109/ISM.2007.18","DOIUrl":"https://doi.org/10.1109/ISM.2007.18","url":null,"abstract":"Real-time delivery of time-dependent data over the Internet is challenging. UDP has often been used to transport data in a timely manner, but its lack of congestion control is often criticized. This criticism is a reason that the vast majority of applications today use TCP. The downside of this is that TCP has problems with the timely delivery of data. A transport protocol that adds congestion control to an otherwise UDP-like behaviour is DCCP. For this protocol, late data choice (LDC) [8] has been proposed to allow adaptive applications control over data packets up to the actual transmission time. We find, however, that application developers appreciate other TCP features as well, such as its reliability. We have therefore implemented and tested the LDC ideas for TCP. It allows the application to modify or drop packets that have been handed to TCP until they are actually transmitted to the network. This is achieved with a shared packet ring and indexes to hold the current status. Our experiments show that we can send more useful data with LDC than without in a streaming scenario. We can therefore claim that we achieve a better utilization of the throughput, giving us a higher goodput with LDC than without.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131608925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-stream Asynchrony Modeling for Audio-Visual Speech Recognition","authors":"Guoyun Lv, D. Jiang, R. Zhao, Yunshu Hou","doi":"10.1109/ISM.2007.21","DOIUrl":"https://doi.org/10.1109/ISM.2007.21","url":null,"abstract":"In this paper, two multi-stream asynchrony Dynamic Bayesian Network models (MS-ADBN model and MM-ADBN model) are proposed for audio-visual speech recognition (AVSR). The proposed models, with different topology structures, loosen the asynchrony of audio and visual streams to the word level. For the MS-ADBN model, in both the audio stream and the visual stream, each word is composed of its corresponding phones, and each phone is associated with an observation vector. The MM-ADBN model is an augmentation of the MS-ADBN model: a level of hidden nodes, the state level, is added between the phone level and the observation node level to describe the dynamic process of phones. Essentially, the MS-ADBN model is a word model, while the MM-ADBN model is a phone model. Speech recognition experiments are done on a digit audio-visual (A-V) database, as well as on a continuous A-V database. The results demonstrate that the asynchrony description between audio and visual streams is important for an AVSR system, and the MM-ADBN model has the best performance for the task of continuous A-V speech recognition.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130157824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MuMiVA: A Multimedia Delivery Platform Using Format-Agnostic, XML-Driven Content Adaptation","authors":"D. V. Deursen, S. D. Bruyne, W. V. Lancker, W. D. Neve, D. D. Schrijver, H. Hellwagner, R. Walle","doi":"10.1109/ISM.2007.13","DOIUrl":"https://doi.org/10.1109/ISM.2007.13","url":null,"abstract":"Due to the increasing heterogeneity in the current multimedia landscape, the delivery of multimedia content has become an important issue today. This heterogeneity is not only reflected by a plethora of different usage environments, but also by the presence of multiple (scalable) coding formats. Therefore, format-independent adaptation engines have to be used within a multimedia delivery platform, which are able to adapt the multimedia content according to a certain usage environment, independent of the underlying coding format of the content. By relying on automatically created textual descriptions of the high-level syntax of binary media resources, a format-independent adaptation engine can be built. MPEG-21 generic bitstream syntax schema (gBS schema) is a tool that is part of the MPEG-21 multimedia framework. It enables the use of generic bitstream syntax descriptions (gBSDs), i.e., textual descriptions in XML, to steer the adaptation of a binary media resource, using format-independent adaptation logic. In this paper, we address the design and performance evaluation of a multimedia delivery platform that relies on gBS schema-driven adaptation engines. Our platform is called MuMiVA; it is a fully integrated, extensible platform for multimedia delivery in heterogeneous usage environments, using streaming technologies. To demonstrate the flexibility of our multimedia delivery platform, we discuss the functioning of two different applications (i.e., exploitation of temporal scalability and shot selection) applied to two different coding formats (i.e., MPEG-4 visual and H.264/AVC).","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121330098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}