Jing Zhang, Linjun Yang, Hong Lu, X. Xue, Yap-Peng Tan
{"title":"Efficient Video Clip Retrieval Using Index Structure","authors":"Jing Zhang, Linjun Yang, Hong Lu, X. Xue, Yap-Peng Tan","doi":"10.1109/MMSP.2005.248688","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248688","url":null,"abstract":"Retrieving similar video clips from large video database requires high query efficiency, precision and recall, which remains a challenging problem since the traditional query algorithms are inefficient and time-consuming. In this paper, we adopt the high-dimensional index structure vector-approximation file (VA-file) to organize the video database, and propose a new similarity measure which takes the temporal order among the video representations into account to improve the accuracy of query. Based on the VA-file and similarity measure, a new video clip retrieval algorithm is proposed in our method to achieve high query efficiency by using restricted sliding window to construct candidate video clips. Experimental results show that the proposed video retrieval method is efficient and effective","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132047114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Video Analysis Approach for Coherent Key-frame Extraction and Object Segmentation","authors":"Xiaomu Song, Guoliang Fan","doi":"10.1109/MMSP.2005.248622","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248622","url":null,"abstract":"We discuss a new video analysis approach for coherent key-frame extraction and object segmentation. As two basic units for content-based video analysis, key-frame extraction and object segmentation are usually implemented independently and separately based on different feature sets. Our previous work showed that by exploiting the inherent relationship between key-frames and objects, a set of salient key-frames can be extracted to support robust and efficient object segmentation. This work furthers the previous numerical studies by suggesting a new analytical approach to jointly formulate key-frame extraction and object segmentation via a statistical mixture model where the concept of frame/pixel saliency is introduced. A modified expectation maximization algorithm is developed for model estimation that leads to the most salient key-frames for object segmentation. Simulations on both synthetic and real videos show the effectiveness and efficiency of the proposed method","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128249714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Multi-Frame-Rate Scheme for Distributed Speech Recognition Based on a Half Frame-Rate Front-End","authors":"Z. Tan, P. Dalsgaard, B. Lindberg","doi":"10.1109/MMSP.2005.248653","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248653","url":null,"abstract":"In this paper a half frame-rate (HFR) front-end is investigated for distributed speech recognition (DSR). The work is inspired from the need for low bit-rate and is justified by the redundancies known to exist in full frame-rate (FFR) features. At the client-side in the DSR architecture, implementation of the HFR is carried out by using double frame shifting as compared to the FFR resulting in the achievement of half the bit rate. At the server-side, each HFR feature vector is repeated once to construct the FFR features and no changes are therefore required in the recognition back-end. It is experimentally justified that the performance achieved by HFR is comparable to FFR and that repetition of each HFR feature vector is critical for the HFR front-end to maintain the performance. Motivated by the effectiveness of HFR, a number of additional FFR-based DSR schemes are further presented. Finally, this paper introduces an adaptive multi-frame-rate scheme in which the DSR system adapts to the characteristics of the transmission channel by switching between HFR and the FFR-based schemes. This multi-frame-rate scheme is found to be superior to the basic FFR","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133679892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rate-Distortion Optimization for Internet Video Summarization and Transmission","authors":"P. Pahalawatta, Zhuo Li, F. Zhai, A. Katsaggelos","doi":"10.1109/MMSP.2005.248632","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248632","url":null,"abstract":"The goal of video summarization is to generate a shorter video sequence of a lengthy original sequence using only the key frames of the original sequence. We consider a video summarization scheme that generates a video summary that can be transmitted over an unreliable network such as the Internet with minimum distortion of the original video. We consider two methods of distortion measurement in our optimization scheme, and we apply the methods to a scenario in which feedback is available with the possibility of retransmitting lost packets. Simulation results showing the effectiveness of using the proposed schemes are presented","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133957777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Quality for Content-Aware Video Adaptation","authors":"T. Thang, Yong Ju Jung, Yong Man Ro","doi":"10.1109/MMSP.2005.248560","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248560","url":null,"abstract":"Recently, the concept of semantic transcoding has been introduced. However, there is little research on semantic quality measure to quantitatively guide and evaluate semantic transcoding. In this paper, we present a framework to formulate the semantic quality of an adapted video compared to the original one. Both original video and adapted video are represented by conceptual graphs, and then the similarities between the graphs are used to compute the overall semantic quality. Moreover, the dependence of quality on context is also discussed","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121920645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Approach to Texture Retrieval","authors":"C. Ng, Guojun Lu, Dengsheng Zhang","doi":"10.1109/MMSP.2005.248549","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248549","url":null,"abstract":"The texture retrieval approach based on Gabor filters has been shown to be very useful for texture retrieval and is widely adopted. We examine this approach in detail and question if this is a suitable approach. We then propose an alternative approach that does not use filters. Our experimental results and analysis show that our proposed approach performs better than the commonly used approach based on Gabor filters","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"293 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125750816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Feature Extraction for Robust Image Classification and Retrieval","authors":"Zhuo Liu, S. Wada","doi":"10.1109/MMSP.2005.248596","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248596","url":null,"abstract":"In this paper, a new feature extraction method for robust image classification and retrieval is proposed. The robust image classification and retrieval systems are required when the images are not ideal such as geometrically distorted and/or contain additive noise. To construct an efficient feature space, an optimum linear transform is obtained by nonlinear optimization in learning process using a set of image samples. In the simulations, the method is experimentally applied to characterize wavelet packet representation of texture images robust to noise and geometrical (rotation and translation) distortion. Further, it is efficiently used for texture retrieval system to demonstrate the usefulness of the method. It is shown that the higher retrieval rate is achieved compared with the conventional approach such as discriminant analysis","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126042104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Piecewise Interpolation Method Based on Log-Least Square Error Criterion for HRTF","authors":"J. Zhang, Zhen-yang Wu","doi":"10.21437/Interspeech.2004-397","DOIUrl":"https://doi.org/10.21437/Interspeech.2004-397","url":null,"abstract":"This paper addresses the problem of accurately realizing the interpolation of spatially discrete head-related transfer function (HRTF) for synthesis of virtual auditory space. By analyzing the advantages and disadvantages of general bilinear interpolation method in 3D-sound, associating with human auditory system's mechanism of band-pass filtering and consulting critical bands of psychoacoustics, the paper presents a piecewise interpolation method based on log-least square error criterion for HRTF's magnitude to compensate the deficiency of bilinear method in intermediate frequency. As seen from the following simulations, this new method accomplishes preferable results","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121794139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Large Scale Peer-To-Peer Video Streaming: Experiments and Empirical Studies","authors":"Xinyan Zhang, Jiangchuan Liu, Bo Li","doi":"10.1109/MMSP.2005.248644","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248644","url":null,"abstract":"It is an attractive idea to spread real time media using peer-to-peer method, however nobody knows if this technology can really deliver real time media to large number of users in global scale. This paper presents an empirical study on CoolStreaming, a data-driven overlay network for live media streaming. The core operations in CoolStreaming are very simple: every node periodically exchanges data availability information with a set of partners, and retrieves unavailable data from one or more partners, or supplies available data to partners. We have extensively evaluated the performance of CoolStreaming over real Internet environment. Our experiments, involving more than 10,000 real users, demonstrate that CoolStreaming can achieve quite good streaming quality even under global scaled, formidable network conditions. Meanwhile, we present several interesting observations from these large-scale tests. Finally, we try to predict some directions to the future of p2p media streaming","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122300680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Effective Algorithm for Breaking F5","authors":"Hong Cai, S. Agaian, Yufeng Wang","doi":"10.1109/MMSP.2005.248568","DOIUrl":"https://doi.org/10.1109/MMSP.2005.248568","url":null,"abstract":"By thwarting visual and chi-square attacks, the F5 steganographic algorithm is viewed as a challenge to steganalysis. This paper presents a novel algorithm that can break F5, even with low embedding rates. The test results show that the proposed method can accurately break F5 when relatively short messages (82 bytes) are embedded into a 256×256 gray image","PeriodicalId":191719,"journal":{"name":"2005 IEEE 7th Workshop on Multimedia Signal Processing","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124647156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}