Proceedings of the 21st ACM international conference on Multimedia: Latest Articles

GLocal structural feature selection with sparsity for multimedia data understanding
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502142
Yan Yan, Zhongwen Xu, Gaowen Liu, Zhigang Ma, N. Sebe
Abstract: The selection of discriminative features is an important and effective technique for many multimedia tasks. Using irrelevant features in classification or clustering tasks can degrade performance. Thus, designing efficient feature selection algorithms to remove irrelevant features is one way to improve classification or clustering performance. With the successful use of sparse models in image and video classification and understanding, imposing structural sparsity in feature selection has been widely investigated in recent years. Motivated by the merits of sparse models, we propose a novel feature selection method using a sparse model. Unlike the state of the art, our method is built upon the $\ell_{2,p}$-norm and simultaneously considers both the global and local (GLocal) structures of the data distribution. Our method is more flexible in selecting discriminative features because it can control the degree of sparseness. Moreover, considering both global and local structures of the data distribution makes our feature selection process more effective. We propose an efficient algorithm to solve the $\ell_{2,p}$-norm sparsity optimization problem. Experimental results on real-world image and video datasets show the effectiveness of our feature selection method compared to several state-of-the-art methods.
Citations: 16
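Note on the $\ell_{2,p}$-norm named in the abstract above: the abstract does not state the objective function, so the following is only a minimal sketch of the commonly used definition, $\|W\|_{2,p} = (\sum_i \|w_i\|_2^p)^{1/p}$ over the rows $w_i$ of a weight matrix $W$, together with ranking features by row norm. The function names `l2p_norm` and `rank_features`, the one-row-per-feature layout, and the range check on `p` are assumptions made for illustration, not the authors' implementation.

```python
import math


def l2p_norm(W, p):
    """Return (sum_i ||w_i||_2 ** p) ** (1 / p) over the rows w_i of W.

    W is a list of rows (one row per feature, by assumption).
    The range 0 < p <= 2 is an illustrative restriction.
    """
    if not 0 < p <= 2:
        raise ValueError("p is expected to lie in (0, 2]")
    row_norms = [math.sqrt(sum(x * x for x in row)) for row in W]
    return sum(r ** p for r in row_norms) ** (1.0 / p)


def rank_features(W):
    """Return feature indices sorted by decreasing row l2-norm."""
    row_norms = [math.sqrt(sum(x * x for x in row)) for row in W]
    return sorted(range(len(W)), key=lambda i: row_norms[i], reverse=True)


# Hypothetical 3-feature, 2-class weight matrix.
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
print(l2p_norm(W, 1.0))   # 6.0 (row norms are 5, 0 and 1)
print(rank_features(W))   # [0, 2, 1]
```

In sparse models of this kind, lowering p generally strengthens the preference for row-sparse solutions, which is presumably the control over sparseness that the abstract refers to.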
An efficient image homomorphic encryption scheme with small ciphertext expansion
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502105
Peijia Zheng, Jiwu Huang
Abstract: Image processing in the encrypted domain has received increasing attention because of its extensive potential applications, for example, providing efficient and secure solutions for privacy-preserving applications in untrusted environments. One obstacle to the widespread use of these techniques is the ciphertext expansion, by orders of magnitude, caused by existing homomorphic encryption schemes. In this paper, we provide a way to tackle this issue for image processing in the encrypted domain. By exploiting characteristics of the image format, we develop an image encryption scheme that limits ciphertext expansion while preserving the homomorphic property. The proposed scheme first encrypts image pixels with an existing probabilistic homomorphic cryptosystem and then compresses the whole encrypted image to save storage space. Our scheme has a much smaller ciphertext expansion factor than element-wise encryption, while preserving the homomorphic property. No additional interactive protocols are required when applying secure signal processing tools to the compressed encrypted image. We present a fast algorithm for the encryption and compression steps of the proposed scheme, which speeds up the computation and makes our scheme much more efficient. We also analyze the security, ciphertext expansion ratio, and computational complexity. Our experiments demonstrate the validity of the proposed algorithms. The proposed scheme is suitable as an image encryption method for applications in secure image processing.
Citations: 39
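To make the terms "homomorphic property" and "ciphertext expansion" in the abstract above concrete, here is a toy sketch of element-wise pixel encryption. The abstract does not name the underlying cryptosystem; Paillier is assumed here only because it is a well-known probabilistic, additively homomorphic scheme. The primes are deliberately tiny and insecure, all function names are hypothetical, and the paper's compression step is not shown.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; valid for generator g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Probabilistic encryption: c = (n + 1)^m * r^n mod n^2."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


def expansion_factor(modulus_bits, pixel_bits=8):
    """Ciphertexts live modulo n^2, i.e. about 2 * modulus_bits bits per pixel."""
    return 2 * modulus_bits / pixel_bits


n, priv = keygen(61, 53)
c1, c2 = encrypt(n, 100), encrypt(n, 55)   # two hypothetical pixel values
print(decrypt(n, priv, add_encrypted(n, c1, c2)))  # 155
print(expansion_factor(1024))                      # 256.0
```

Under these assumptions, encrypting every 8-bit pixel separately with a 1024-bit modulus inflates storage by a factor of 256, which illustrates the kind of expansion the abstract sets out to reduce.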
Real-time salient object detection
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502240
Chia-Ju Lu, Chih-Fan Hsu, Mei-Chen Yeh
Abstract: Salient object detection techniques have a variety of multimedia applications of broad interest. However, detection must be fast to truly aid these applications. Many robust algorithms tackle the salient object detection problem, but most of them are computationally demanding. In this demonstration, we show a fast salient object detection system implemented in a conventional PC environment. We examine the challenges faced in the design and development of a practical system that can achieve accurate detection in real time.
Citations: 5
Classifying tag relevance with relevant positive and negative examples
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502129
Xirong Li, Cees G. M. Snoek
Abstract: Image tag relevance estimation aims to automatically determine whether what people label about images is factually present in the pictorial content. Unlike previous works, which either use only positive examples of a given tag or use positive and random negative examples, we argue for the importance of relevant positive and relevant negative examples for tag relevance estimation. We propose a system that selects, from crowd-annotated images, the positive and negative examples deemed most relevant with respect to a given tag. While applying models for many tags could be cumbersome, our system trains efficient ensembles of Support Vector Machines per tag, enabling fast classification. Experiments on two benchmark sets show that the proposed system compares favorably against five present-day methods. Given extracted visual features, our system can process up to 3,787 tags per second for each image. The new system is both effective and efficient for tag relevance estimation.
Citations: 40
Facilitating fashion camouflage art
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502121
Ranran Feng, B. Prabhakaran
Abstract: Artists and fashion designers have recently been creating a new form of art, Camouflage Art, which can be used to prevent computer vision algorithms from detecting faces. This digital art technique combines makeup and hair styling, or other modifications such as facial painting, to help avoid automatic face detection. In this paper, we first study camouflage interference and its effectiveness against several current state-of-the-art techniques in face detection and recognition, and then present a tool that facilitates digital art design for camouflage that can fool these computer vision algorithms. The tool finds the prominent or decisive features in facial images that lead to the face being recognized, and suggests camouflage options (makeup, styling, paints) for particular facial features or facial parts. Testing shows that the tool can effectively aid artists and designers in creating camouflage designs that thwart these algorithms. An evaluation of the suggested camouflages, applied to 40 celebrities across eight face recognition systems (both non-commercial and commercial), shows that the subject is unrecognizable 82.5% to 100% of the time when using the suggested camouflage.
Citations: 24
Session details: Best paper session
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/3245285
R. Zimmerman
Citations: 0
Spatio-temporal Fisher vector coding for surveillance event detection
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502155
Qiang Chen, Yang Cai, L. Brown, A. Datta, Quanfu Fan, R. Feris, Shuicheng Yan, Alexander Hauptmann, Sharath Pankanti
Abstract: We present a generic event detection system evaluated in the Surveillance Event Detection (SED) task of TRECVID 2012. We investigate a statistical approach with spatio-temporal features applied to the seven event classes defined by the SED task. The approach is based on local spatio-temporal descriptors, called MoSIFT, generated from pairs of video frames. A Gaussian Mixture Model (GMM) is learned to model the distribution of the low-level features. Then, for each sliding window, Fisher vector (FV) encoding [improvedFV] is used to generate the sample representation. A linear SVM is trained for each event. The main novelty of our system is the introduction of Fisher vector encoding into video event detection. Fisher vector encoding has demonstrated great success in image classification. The key idea is to model the low-level visual features with a Gaussian Mixture Model and to generate an intermediate vector representation for a bag of features. FV encoding uses higher-order statistics in place of the histograms of the standard bag-of-words (BoW) model. FV has several good properties: (a) it can naturally separate the video-specific information from the noisy local features, and (b) a linear model can be used with this representation. We build an efficient implementation of FV encoding that attains a 10-times speed-up over real time. We also feed non-trivial object localization techniques, e.g., multi-scale detection and non-maximum suppression, into the video event detection. This approach outperformed the submissions of all other teams in TRECVID SED 2012 on four of the seven event types.
Citations: 16
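The abstract above mentions scoring sliding windows and applying non-maximum suppression. The exact variant used by the authors is not described, so the following is only a sketch of the standard greedy form applied to temporal windows: keep the highest-scoring window, drop any window that overlaps a kept one too strongly, and repeat. The `(start, end, score)` tuple layout, the function names, and the 0.5 overlap threshold are assumptions for illustration.

```python
def temporal_iou(a, b):
    """Intersection-over-union of two time intervals given as (start, end, ...)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(windows, iou_threshold=0.5):
    """Greedy NMS over (start, end, score) windows, highest score first."""
    kept = []
    for w in sorted(windows, key=lambda w: w[2], reverse=True):
        # Keep w only if it does not overlap any already kept window too much.
        if all(temporal_iou(w, k) <= iou_threshold for k in kept):
            kept.append(w)
    return kept


# Hypothetical detections for one event class (times in frames).
detections = [(0, 10, 0.9), (2, 12, 0.8), (20, 30, 0.7)]
print(non_maximum_suppression(detections))
# [(0, 10, 0.9), (20, 30, 0.7)]
```

Here the second window overlaps the first with an IoU of 8/12, above the threshold, so it is suppressed, while the third window does not overlap and survives.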
Object co-segmentation via discriminative low rank matrix recovery
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2502195
Yong Li, J. Liu, Zechao Li, Yang Liu, Hanqing Lu
Abstract: The goal of this paper is to simultaneously segment the object regions appearing in a set of images of the same object class, a task known as object co-segmentation. Unlike typical methods, which simply assume that the regions common among images are the object regions, we additionally consider the disturbance from consistent backgrounds and take the object regions to be those that are not only common but also salient among the images. To this end, we propose a Discriminative Low Rank matrix Recovery (DLRR) algorithm to divide the over-segmented regions (i.e., superpixels) of a given image set into object and non-object ones. In DLRR, a low-rank matrix recovery term is adopted to detect salient regions in an image, while a discriminative learning term is used to distinguish the object regions from all the superpixels. An additional regularization term is introduced to jointly measure the disagreement between the predicted saliency and the objectiveness probability of each superpixel in the image set. To solve the unified learning problem that connects these three terms, we design an efficient optimization procedure based on block-coordinate descent. Extensive experiments are conducted on two public datasets, MSRC and iCoseg, and comparisons with several state-of-the-art methods demonstrate the effectiveness of our work.
Citations: 7
Multimedia framed
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2512088
E. Churchill
Abstract: Multimedia is the combination of several media forms. More typically, the word implies sound and full-motion video. While multimedia technologists concern themselves with the production and distribution of the multimedia artifacts themselves, information designers, educationalists, and artists are more concerned with the reception of the artifact, and consider multimedia to be another representational format for multimodal information presentation. Such a perspective leads to questions such as: Is text, audio, video, or a combination of all three the best format for the message? Should another modality (e.g., haptics/touch, olfaction) be invoked instead or in addition? How does the setting affect perception and reception? Is the artifact interactive? Is it changed by audience members? Understanding how an artifact is perceived, received, and interacted with is central to understanding what multimedia is, opening up possibilities and posing technical challenges as we imagine new forms and formats of multimedia experience. In this talk, I will illustrate how content understanding is modulated by context, by the “framing” of the content. I will discuss audience-participatory production of multimedia and multimodal experiences. I will conclude with some technical excitements, design and development challenges, and experiential possibilities that lie ahead.
Citations: 0
Scalable training with approximate incremental Laplacian eigenmaps and PCA
Proceedings of the 21st ACM international conference on Multimedia Pub Date: 2013-10-21 DOI: 10.1145/2502081.2508124
Eleni Mantziou, S. Papadopoulos, Y. Kompatsiaris
Abstract: This paper describes the approach, the experimental settings, and the results obtained by the proposed methodology at the ACM Yahoo! Multimedia Grand Challenge. Its main contribution is the use of fast and efficient features with a highly scalable semi-supervised learning approach, Approximate Laplacian Eigenmaps (ALEs), and an extension that computes the test set incrementally, for learning concepts in time linear in the number of images (both labelled and unlabelled). Two local visual features, combined with the VLAD feature aggregation method and PCA, are used to improve efficiency and time complexity. Our methodology achieves somewhat better accuracy than the baseline (linear SVM) on small training sets, and its performance improves as the training data increase. Performing ALE fusion on a training set of 50K per concept resulted in a MiAP score of 0.4223, which was among the highest scores of the proposed approach.
Citations: 8
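The abstract above relies on VLAD feature aggregation. As a rough illustration only, plain VLAD assigns each local descriptor to its nearest codebook centroid, sums the residuals per centroid, concatenates the sums, and L2-normalizes the result. The sketch below assumes the codebook is already given (it is usually learned with k-means), omits the PCA step mentioned in the abstract, and uses the hypothetical function name `vlad`; it is not the authors' pipeline.

```python
import math


def vlad(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalized VLAD vector."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Index of the nearest centroid by squared Euclidean distance.
        k = min(
            range(len(centroids)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(d, centroids[j])),
        )
        # Accumulate the residual (descriptor minus centroid) for that cell.
        for i in range(dim):
            residuals[k][i] += d[i] - centroids[k][i]
    flat = [x for row in residuals for x in row]
    norm = math.sqrt(sum(x * x for x in flat))
    return [x / norm for x in flat] if norm > 0 else flat


# Hypothetical 2-D descriptors and a 2-word codebook.
codebook = [[0.0, 0.0], [10.0, 10.0]]
local_features = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
print(len(vlad(local_features, codebook)))  # 4 (codebook size times dimension)
```

The output length is fixed at codebook size times descriptor dimension regardless of how many local descriptors an image has, which is what makes a subsequent projection such as PCA straightforward to apply.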