2012 IEEE International Conference on Multimedia and Expo: Latest Papers

Efficient Tag Mining via Mixture Modeling for Real-Time Search-Based Image Annotation
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.104
Lican Dai, Xin-Jing Wang, Lei Zhang, Nenghai Yu
Abstract: Although it has been studied extensively for many years, automatic image annotation remains a challenging problem. Recently, data-driven approaches have demonstrated great success in image auto-annotation. Such approaches leverage abundant, partially annotated web images to annotate an uncaptioned image. Specifically, they first retrieve a group of visually similar images using the uncaptioned image as a query, then extract meaningful phrases from the texts surrounding the image search results. Since the surrounding texts are generally noisy, mining meaningful phrases effectively is crucial to the success of such approaches. We propose a mixture modeling approach which assumes that a tag is generated from a convex combination of topics. Unlike a typical topic modeling approach such as LDA, topics in our approach are explicitly learnt from a definitive catalog of the Web, i.e., the Open Directory Project (ODP). Compared with previous work, it has two advantages: first, it uses an open vocabulary rather than a limited one defined by a training set; second, it is efficient for real-time annotation. Experiments conducted on two billion web images show the efficiency and effectiveness of the proposed approach.
Citations: 2
Evaluating Gaussian Like Image Representations over Local Features
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.23
Yu-Chuan Su, Guan-Long Wu, Tzu-Hsuan Chiu, Winston H. Hsu, Kuo-Wei Chang
Abstract: Recently, several Gaussian-like image representations have been proposed as alternatives to the bag-of-words representation over local features. These representations are designed to overcome the quantization error faced by the bag-of-words representation. They have been shown to be effective in different applications: Extended Hierarchical Gaussianization reached excellent performance using a single feature on VOC2009, and Vector of Locally Aggregated Descriptors and Fisher Kernel reached excellent performance using only a signature-like representation on the Holiday dataset. Despite their success and similarity, no comparative study of these representations has been made. In this paper, we perform a systematic comparison of three emerging Gaussian-like representations: Extended Hierarchical Gaussianization, Fisher Kernel, and Vector of Locally Aggregated Descriptors. We evaluate the performance of these representations, and the influence of features and parameters on them, on the Holiday and CC_Web_Video datasets, and we observe several important properties of these representations during our investigation. This study provides a better understanding of these Gaussian-like image representations, which are believed to be promising in various applications.
Citations: 1
Learning Semantic Motion Patterns for Dynamic Scenes by Improved Sparse Topical Coding
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.133
Wei Fu, Jinqiao Wang, Zechao Li, Hanqing Lu, Songde Ma
Abstract: With the proliferation of cameras in public areas, it is becoming increasingly desirable to develop fully automated surveillance and monitoring systems. In this paper, we propose a novel unsupervised approach to automatically explore motion patterns occurring in dynamic scenes under an improved sparse topical coding (STC) framework. Given an input video from a fixed camera, we first segment the whole video into a sequence of non-overlapping clips (documents). Optical flow features are extracted from each pair of consecutive frames and quantized into discrete visual words. Then the video is represented by a word-document hierarchical topic model through a generative process. Finally, an improved sparse topical coding approach is proposed for model learning. The semantic motion patterns (latent topics) are learned automatically, and each video clip is represented as a weighted summation of these patterns with only a few nonzero coefficients. The proposed approach is purely data-driven and scene-independent (not object-class specific), which makes it suitable for a very wide range of scenarios. Experiments demonstrate that our approach outperforms state-of-the-art techniques in dynamic scene analysis.
Citations: 21
Predicting Image Popularity in an Incomplete Social Media Community by a Weighted Bi-partite Graph
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.43
Xiang Niu, Lusong Li, Tao Mei, Jialie Shen, Ke Xu
Abstract: Popularity prediction is a key problem in analyzing information diffusion in networks, especially in social media communities. Recently, some custom-built prediction models have been proposed for Digg and YouTube. However, because of their unique parameters, these models are hard to transplant to an incomplete social network site (e.g., Flickr). In addition, because of the large scale of the Flickr network, it is difficult to obtain all of the photos and the whole network. Thus, we seek a method that can be used in such an incomplete network. Inspired by a collaborative filtering method, Network-based Inference (NBI), we devise a weighted bipartite graph with undetected users and items to represent the resource allocation process in an incomplete network. Instead of image analysis, we propose a modified interdisciplinary model, called Incomplete Network-based Inference (INI). Using 30 months of data from Flickr, we show that the proposed INI increases prediction accuracy by over 58.1% compared with traditional NBI. We apply the proposed INI approach to a personalized advertising application and show that it is more attractive than traditional Flickr advertising.
Citations: 17
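The NBI baseline named in this abstract is commonly described as a two-step resource allocation on a user-item bipartite graph: resource flows from the items a user has collected to the users who share them, then back to items. The sketch below illustrates only that unweighted baseline, not the paper's INI model with undetected users and items. The function name `nbi_scores` and the toy `favorites` data are hypothetical.

```python
from collections import defaultdict

def nbi_scores(user_items, target_user):
    """Two-step resource allocation (items -> users -> items) for one user."""
    # Invert the user -> items map to get item -> users.
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    # Every item the target user collected starts with one unit of resource,
    # which it splits equally among all users who collected it.
    user_resource = defaultdict(float)
    for item in user_items[target_user]:
        share = 1.0 / len(item_users[item])
        for user in item_users[item]:
            user_resource[user] += share

    # Each user then splits the received resource equally among their items.
    item_resource = defaultdict(float)
    for user, resource in user_resource.items():
        share = resource / len(user_items[user])
        for item in user_items[user]:
            item_resource[item] += share
    return dict(item_resource)

# Hypothetical toy data: which photos each user marked as a favorite.
favorites = {
    "u1": {"a", "b"},
    "u2": {"b", "c"},
    "u3": {"c"},
}
scores = nbi_scores(favorites, "u1")
```

For this toy graph the scores for `u1` are 0.75 for `a`, 1.0 for `b`, and 0.25 for `c`. The total resource (2.0) is conserved, and `c`, which `u1` has not collected, receives a nonzero score through the shared neighbor `u2`.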
Fast Near-Duplicate Video Retrieval via Motion Time Series Matching
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.111
John R. Zhang, J. Ren, Fangzhe Chang, Thomas L. Wood, J. Kender
Abstract: This paper introduces a method for the efficient comparison and retrieval of near duplicates of a query video from a video database. The method generates video signatures from histograms of orientations of the optical flow of feature points, computed from uniformly sampled video frames and concatenated over time to produce time series, which are then aligned and matched. Major incline matching, a data reduction and peak alignment method for time series, is adapted for faster performance. The resultant method is compact and robust against a number of common transformations, including flipping, cropping, picture-in-picture, photometric changes, and the addition of noise and other artifacts. We evaluate on the MUSCLE VCD 2007 dataset and a dataset derived from TRECVID 2009. The method shows good precision (88.8% on average) at significantly higher speeds than results reported in the literature (average durations: 45 seconds for signature generation plus 92 seconds for a linear search of an 81-second query video in a 300-hour dataset).
Citations: 18
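The signature described in this abstract starts from per-frame histograms of optical-flow orientations, concatenated over time. The following is a minimal sketch of that first stage only, assuming the flow vectors have already been computed elsewhere. The bin count, the function names, and the toy `frames` data are illustrative assumptions, and the major incline matching step is not shown.

```python
import math

def orientation_histogram(flows, bins=8):
    """Count flow vectors (dx, dy) per orientation bin over [0, 2*pi)."""
    hist = [0] * bins
    for dx, dy in flows:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no orientation
        angle = math.atan2(dy, dx) % (2 * math.pi)
        hist[int(angle / (2 * math.pi) * bins) % bins] += 1
    return hist

def motion_time_series(per_frame_flows, bins=8):
    """One histogram per sampled frame, kept in temporal order."""
    return [orientation_histogram(flows, bins) for flows in per_frame_flows]

# Hypothetical flow vectors for two sampled frames.
frames = [
    [(1, 1), (2, 2), (-1, 1)],   # two vectors in quadrant 1, one in quadrant 2
    [(-1, -1), (1, -1)],         # one vector each in quadrants 3 and 4
]
series = motion_time_series(frames, bins=4)
```

With four bins, each bin corresponds to one quadrant, so `series` is `[[2, 1, 0, 0], [0, 0, 1, 1]]`. Reading one bin across frames gives one of the time series that a matching step would then align.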
Principal Components Analysis-Based Edge-Directed Image Interpolation
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.153
Bing Yang, Zhiyong Gao, Xiaoyun Zhang
Abstract: This paper presents an edge-directed, noniterative image interpolation algorithm. In the proposed algorithm, the gradient directions are explicitly estimated with a statistics-based approach. The local dominant gradient directions are obtained by applying principal components analysis (PCA) to the four nearest gradients. The angles of the whole gradient plane are divided into four parts, and each gradient direction falls into one part. We then perform the interpolation using one-dimensional (1-D) cubic convolution interpolation perpendicular to the gradient direction. Simulation results show that, compared with state-of-the-art interpolation methods, the proposed PCA-based edge-directed interpolation method preserves edges well while maintaining a high PSNR value.
Citations: 12
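The direction-estimation step in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact procedure: it uses the uncentered second-moment matrix of the four gradients (so that opposite-pointing gradients reinforce the same axis), and it assumes the four angular parts are 45-degree sectors centred on 0, 45, 90, and 135 degrees. All names are hypothetical.

```python
import math

def dominant_gradient_angle(gradients):
    """Angle in [0, pi) of the principal axis of a set of 2-D gradients."""
    sxx = sum(gx * gx for gx, _ in gradients)
    syy = sum(gy * gy for _, gy in gradients)
    sxy = sum(gx * gy for gx, gy in gradients)
    # Closed-form orientation of the leading eigenvector of
    # the symmetric matrix [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return theta % math.pi

def quantize_direction(theta):
    """Map an angle to one of four sectors centred on 0, 45, 90, 135 degrees."""
    return int(round(theta / (math.pi / 4))) % 4

# Hypothetical four nearest gradients around a pixel to interpolate.
nearest = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (1.5, 1.5)]
theta = dominant_gradient_angle(nearest)
gradient_sector = quantize_direction(theta)
# Interpolation runs along the edge, perpendicular to the gradient.
edge_sector = quantize_direction((theta + math.pi / 2) % math.pi)
```

Here the gradients all lie along the 45-degree diagonal, so `theta` is pi/4, `gradient_sector` is 1, and `edge_sector` is 3 (the 135-degree diagonal), which is the axis along which a 1-D interpolation kernel would be applied.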
Error Modeling and Estimation Fusion for Indoor Localization
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.106
Weipeng Zhuo, Bo Zhang, S. Chan, E. Chang
Abstract: There has been much interest in offering multimedia location-based services (LBS) to indoor users (e.g., sending video/audio streams according to user locations). Offering good LBS depends largely on accurate indoor localization of mobile stations (MSs). To achieve that, in this paper we first model and analyze the error characteristics of important indoor localization schemes using Radio Frequency Identification (RFID) and Wi-Fi. Our models are simple to use, capture important system parameters and measurement noise, and quantify how they affect localization accuracy. Given that many indoor localization techniques have been deployed, an MS may simultaneously receive multiple co-existing estimates of its location. Equipped with this understanding of location errors, we then investigate how to optimally combine, or fuse, all the co-existing estimates of an MS's location. We present computationally efficient closed-form expressions to fuse the outputs of the estimators. Simulation and experimental results show that our fusion technique achieves higher location accuracy in spite of location errors in the estimators.
Citations: 12
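The abstract does not give the closed-form fusion expressions, so the sketch below substitutes a standard textbook technique: inverse-variance weighting of independent, unbiased estimates. It is meant only to show what fusing co-existing location estimates can look like. The function name, the isotropic per-estimator variance, and the sample RFID and Wi-Fi values are all assumptions.

```python
def fuse_estimates(estimates):
    """Fuse [((x, y), variance), ...] by inverse-variance weighting."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    x = sum(w * pos[0] for w, (pos, _) in zip(weights, estimates)) / total
    y = sum(w * pos[1] for w, (pos, _) in zip(weights, estimates)) / total
    # The fused variance is never larger than the smallest input variance.
    return (x, y), 1.0 / total

rfid = ((2.0, 4.0), 1.0)   # hypothetical RFID estimate, variance 1 m^2
wifi = ((4.0, 8.0), 4.0)   # hypothetical Wi-Fi estimate, variance 4 m^2
fused_pos, fused_var = fuse_estimates([rfid, wifi])
```

With these inputs the fused position is (2.4, 4.8), pulled toward the lower-variance RFID estimate, and the fused variance is 0.8, below either input variance.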
Traffic Reduction for Multiple Users in Multi-view Video Streaming
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.185
T. Fujihashi, Ziyuan Pan, Takashi Watanabe
Abstract: Multi-view video consists of multiple video sequences captured simultaneously from different angles by multiple closely spaced cameras. It enables users to freely change their viewpoints by playing different video sequences. Transmission of multi-view video requires more bandwidth than conventional multimedia. To reduce the bandwidth, UDMVT (User Dependent Multi-view Video Transmission), based on MVC (Multi-view Video Coding), has been proposed for a single user. With multiple users, UDMVT encodes the same frames into a different version for each user, which increases redundant transmission. To address this problem, this paper proposes UMSM (User dependent Multi-view video Streaming for Multi-users). UMSM has two characteristics. First, overlapped frames that are required by multiple users are transmitted only once using multicast, avoiding unnecessary duplicate transmission. Second, the time lag between video requests by multiple users is adjusted to coincide with the next request. Simulation results using benchmark test sequences provided by MERL show that UMSM decreases the transmission bit rate by 55.3% on average for 5 users watching the same multi-view video, compared with UDMVT.
Citations: 8
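The first UMSM characteristic, sending overlapped frames once by multicast, can be illustrated by simple counting. The sketch below compares per-user transmission with sending the union of requested frames once. It deliberately ignores inter-frame coding dependencies, frame sizes, and the request time-lag adjustment, and the `(view, frame_index)` request format is a hypothetical simplification.

```python
def transmissions(requests):
    """Return (per-user unicast count, multicast count) of frame transmissions."""
    # Unicast: every user receives a private copy of each requested frame.
    unicast = sum(len(frames) for frames in requests.values())
    # Multicast: each distinct frame is sent once, however many users need it.
    multicast = len(set().union(*requests.values()))
    return unicast, multicast

# Hypothetical requests: user -> set of (view, frame_index) pairs.
requests = {
    "u1": {(0, 0), (0, 1), (1, 2)},
    "u2": {(0, 0), (1, 1), (1, 2)},
    "u3": {(0, 0), (0, 1), (0, 2)},
}
unicast_count, multicast_count = transmissions(requests)
```

In this toy case nine per-user transmissions collapse to five multicast transmissions, because frames such as `(0, 0)` are shared by all three users.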
Energy-Aware Operation of Black Box Surveillance Cameras under Event Uncertainty and Memory Constraint
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.21
Giwon Kim, Jungsoo Kim, Jongpil Jung, C. Kyung
Abstract: In this paper, we propose an event-driven black box surveillance camera that reduces energy consumption by waking up the system only when an event is detected and by dynamically adjusting the video encoding, and the resulting image distortion, according to the criticality of captured frames, called the significance level. To achieve this goal, we find an encoding bit rate that minimizes the energy consumption of the camera while satisfying the limited memory space constraint and the distortion requirement at each significance level, by judiciously allocating bit rate to each significance level. To do so, we consider the trade-off between total energy consumption and encoding bit rate according to the significance level. For further energy savings, we also propose a low-complexity solution that adjusts the energy-minimal encoding bit rate based on dynamically changing event behavior, i.e., the timing and duration of events. Experimental results show that the proposed method yields up to 67.49% (49.19% on average) energy savings compared with conventional bit rate allocation methods.
Citations: 5
A Unified Estimation-Theoretic Framework for Error-Resilient Scalable Video Coding
2012 IEEE International Conference on Multimedia and Expo Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.76
Jingning Han, Vinay Melkote, K. Rose
Abstract: A novel scalable video coding (SVC) scheme is proposed for video transmission over lossy networks, which builds on an estimation-theoretic (ET) framework for optimal prediction and error concealment, given all available information from both the current base layer and prior enhancement layer frames. It incorporates a recursive end-to-end distortion estimation technique, namely, the spectral coefficient-wise optimal recursive estimate (SCORE), which accounts for all ET operations and tracks the first and second moments of decoder-reconstructed transform coefficients. The overall framework enables optimization of ET-SVC systems for transmission over lossy networks, while accounting for all relevant conditions including the effects of quantization, channel loss, concealment, and error propagation. It thus resolves longstanding difficulties in combining truly optimal prediction and concealment with optimal end-to-end distortion and error-resilient SVC coding decisions. Experiments demonstrate that the proposed scheme offers substantial performance gains over existing error-resilient SVC systems, across a wide range of packet loss rates and bit rates.
Citations: 1