Proceedings of the 24th ACM international conference on Multimedia: Latest Publications

Partial Multi-Modal Sparse Coding via Adaptive Similarity Structure Regularization
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2967201
Zhou Zhao, Hanqing Lu, Deng Cai, Xiaofei He, Yueting Zhuang
{"title":"Partial Multi-Modal Sparse Coding via Adaptive Similarity Structure Regularization","authors":"Zhou Zhao, Hanqing Lu, Deng Cai, Xiaofei He, Yueting Zhuang","doi":"10.1145/2964284.2967201","DOIUrl":"https://doi.org/10.1145/2964284.2967201","url":null,"abstract":"Multi-modal sparse coding has played an important role in many multimedia applications, where data are usually with multiple modalities. Recently, various multi-modal sparse coding approaches have been proposed to learn sparse codes of multi-modal data, which assume that data appear in all modalities, or at least there is one modality containing all data. However, in real applications, it is often the case that some modalities of the data may suffer from missing information and thus result in partial multi-modality data. In this paper, we propose to solve the partial multi-modal sparse coding problem via multi-modal similarity structure regularization. Specifically, we propose a partial multi-modal sparse coding framework termed Adaptive Partial Multi-Modal Similarity Structure Regularization for Sparse Coding (AdaPM2SC), which preserves the similarity structure within the same modality and between different modalities. Experimental results conducted on two real-world datasets demonstrate that AdaPM2SC significantly outperforms the state-of-the-art methods under partial multi-modality scenario.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134311238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
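As a rough illustration of the general idea behind similarity-structure regularization (not the AdaPM2SC algorithm itself), the sketch below evaluates a graph-regularized sparse coding objective: a reconstruction term, an L1 sparsity penalty, and a Laplacian smoothness term trace(Z L Z^T) that pushes similar samples toward similar codes. All variable names, sizes, and the single-modality simplification are illustrative assumptions.

```python
import numpy as np

def graph_regularized_sparse_coding_objective(X, D, Z, W, lam=0.1, gamma=0.5):
    """Evaluate a graph-regularized sparse coding objective (illustrative).

    X : (d, n) data matrix, one sample per column
    D : (d, k) dictionary
    Z : (k, n) sparse codes, one code per column
    W : (n, n) pairwise similarity matrix between samples
    lam, gamma : weights of the sparsity and graph-smoothness terms
    """
    # Reconstruction error ||X - DZ||_F^2
    recon = np.linalg.norm(X - D @ Z, ord="fro") ** 2
    # L1 sparsity penalty on the codes
    sparsity = np.abs(Z).sum()
    # Graph Laplacian L = Deg - W; trace(Z L Z^T) is small when
    # similar samples (large W_ij) receive similar codes
    L = np.diag(W.sum(axis=1)) - W
    smooth = np.trace(Z @ L @ Z.T)
    return recon + lam * sparsity + gamma * smooth

# Toy usage with random data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 20))
D = rng.normal(size=(16, 8))
Z = rng.normal(size=(8, 20)) * (rng.random((8, 20)) < 0.3)  # sparse codes
dists = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)
W = np.exp(-np.square(dists))
np.fill_diagonal(W, 0.0)  # no self-similarity edges
print(graph_regularized_sparse_coding_objective(X, D, Z, W))
```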
Semantic Description of Timbral Transformations in Music Production
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2967238
R. Stables, B. D. Man, Sean Enderby, J. Reiss, György Fazekas, Thomas Wilmering
{"title":"Semantic Description of Timbral Transformations in Music Production","authors":"R. Stables, B. D. Man, Sean Enderby, J. Reiss, György Fazekas, Thomas Wilmering","doi":"10.1145/2964284.2967238","DOIUrl":"https://doi.org/10.1145/2964284.2967238","url":null,"abstract":"In music production, descriptive terminology is used to define perceived sound transformations. By understanding the underlying statistical features associated with these descriptions, we can aid the retrieval of contextually relevant processing parameters using natural language, and create intelligent systems capable of assisting in audio engineering. In this study, we present an analysis of a dataset containing descriptive terms gathered using a series of processing modules, embedded within a Digital Audio Workstation. By applying hierarchical clustering to the audio feature space, we show that similarity in term representations exists within and between transformation classes. Furthermore, the organisation of terms in low-dimensional timbre space can be explained using perceptual concepts such as size and dissonance. We conclude by performing Latent Semantic Indexing to show that similar groupings exist based on term frequency.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134630900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26
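The two analysis tools named in the abstract above, hierarchical clustering of a feature space and Latent Semantic Indexing of a term-frequency matrix, can be sketched as follows. This is a minimal, generic illustration on random toy data, not the paper's pipeline; all array shapes are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(1)

# Hypothetical audio-feature vectors for descriptive terms (one row per term)
term_features = rng.normal(size=(12, 40))

# Agglomerative (hierarchical) clustering of the term feature space
Z = linkage(term_features, method="ward")
clusters = fcluster(Z, t=3, criterion="maxclust")
print("cluster assignments:", clusters)

# Latent Semantic Indexing: truncated SVD of a term-frequency matrix
term_doc = rng.integers(0, 5, size=(12, 30)).astype(float)
lsi_coords = TruncatedSVD(n_components=2).fit_transform(term_doc)
print("2-D LSI coordinates:\n", lsi_coords)
```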
Weakly-Supervised Recognition, Localization, and Explanation of Visual Entities
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2971479
P. Mettes
{"title":"Weakly-Supervised Recognition, Localization, and Explanation of Visual Entities","authors":"P. Mettes","doi":"10.1145/2964284.2971479","DOIUrl":"https://doi.org/10.1145/2964284.2971479","url":null,"abstract":"To learn from visual collections, manual annotations are required. Humans however can no longer keep up with providing strong and time consuming annotations on the ever increasing wealth of visual data. As a result, approaches are required that can learn from fast and weak forms of annotations in visual data. This doctorial symposium summarizes my ongoing PhD dissertation on how to utilize weakly-supervised annotations to recognize, localize, and explain visual entities in images and videos. In this context, visual entities denote objects, scenes, and actions (in images), and actions and events (in videos). The summary is performed through four publications. For each publication, we discuss the current state-of-the-art, as well as our proposed novelties and performed experiments. The end of the summary discusses several possibilities to extend the dissertation.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"169 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133168700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Multi-Modal Learning: Study on A Large-Scale Micro-Video Data Collection
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2971477
Jingyuan Chen
{"title":"Multi-Modal Learning: Study on A Large-Scale Micro-Video Data Collection","authors":"Jingyuan Chen","doi":"10.1145/2964284.2971477","DOIUrl":"https://doi.org/10.1145/2964284.2971477","url":null,"abstract":"Micro-video sharing social services, as a new phenomenon in social media, enable users to share micro-videos and thus gain increasing enthusiasm among people. One distinct characteristic of micro-videos is the multi-modality, as these videos always have visual signals, audio tracks, textual descriptions as well as social clues. Such multi-modality data makes it possible to obtain a comprehensive understanding of videos and hence provides new opportunities for researchers. However, limited efforts thus far have been dedicated to this new emerging user-generated contents (UGCs) due to the lack of large-scale benchmark dataset. Towards this end, in this paper, we construct a large-scale micro-video dataset, which can support many research domains, such as popularity prediction and venue estimation. Based upon this dataset, we conduct an initial study in popularity prediction of micro-videos. Finally, we identify our future work.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133295259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
ThePlantGame: Actively Training Human Annotators for Domain-specific Crowdsourcing
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2973820
Maximilien Servajean, A. Joly, D. Shasha, Julien Champ, Esther Pacitti
{"title":"ThePlantGame: Actively Training Human Annotators for Domain-specific Crowdsourcing","authors":"Maximilien Servajean, A. Joly, D. Shasha, Julien Champ, Esther Pacitti","doi":"10.1145/2964284.2973820","DOIUrl":"https://doi.org/10.1145/2964284.2973820","url":null,"abstract":"In a typical citizen science/crowdsourcing environment, the contributors label items. When there are few labels, it is straightforward to train contributors and judge the quality of their labels by giving a few examples with known answers. Neither is true when there are thousands of domain-specific labels and annotators with heterogeneous skills. This demo paper presents an Active User Training framework implemented as a serious game called ThePlantGame. It is based on a set of data-driven algorithms allowing to (i) actively train annotators, and (ii) evaluate the quality of contributors' answers on new test items to optimize predictions.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"128 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132065000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
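The paper above does not publish its aggregation algorithm here, but a common baseline pattern for the problem it describes, judging annotator quality on items with known answers and weighting votes accordingly, can be sketched as below. All names and labels are hypothetical toy data.

```python
from collections import defaultdict

def annotator_accuracy(answers, gold):
    """Estimate each annotator's accuracy on items with known (gold) labels.

    answers : dict annotator -> dict item -> label
    gold    : dict item -> true label (test items only)
    """
    acc = {}
    for annot, labels in answers.items():
        scored = [labels[i] == g for i, g in gold.items() if i in labels]
        acc[annot] = sum(scored) / len(scored) if scored else 0.5
    return acc

def weighted_vote(answers, acc, item):
    """Aggregate labels for one item, weighting each vote by annotator accuracy."""
    votes = defaultdict(float)
    for annot, labels in answers.items():
        if item in labels:
            votes[labels[item]] += acc[annot]
    return max(votes, key=votes.get) if votes else None

# Toy example with hypothetical annotators and plant species labels
answers = {
    "alice": {"img1": "Quercus robur", "img2": "Acer campestre", "img3": "Quercus robur"},
    "bob":   {"img1": "Quercus robur", "img2": "Quercus robur",  "img3": "Acer campestre"},
}
gold = {"img1": "Quercus robur", "img2": "Acer campestre"}
acc = annotator_accuracy(answers, gold)
print(acc)
print(weighted_vote(answers, acc, "img3"))
```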
Performance Measurements of Virtual Reality Systems: Quantifying the Timing and Positioning Accuracy
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2967303
Chun-Ming Chang, Cheng-Hsin Hsu, Chih-Fan Hsu, Kuan-Ta Chen
{"title":"Performance Measurements of Virtual Reality Systems: Quantifying the Timing and Positioning Accuracy","authors":"Chun-Ming Chang, Cheng-Hsin Hsu, Chih-Fan Hsu, Kuan-Ta Chen","doi":"10.1145/2964284.2967303","DOIUrl":"https://doi.org/10.1145/2964284.2967303","url":null,"abstract":"We propose the very first non-intrusive measurement methodology for quantifying the performance of commodity Virtual Reality (VR) systems. Our methodology considers the VR system under test as a black-box and works with any VR applications. Multiple performance metrics on timing and positioning accuracy are considered, and detailed testbed setup and measurement steps are presented. We also apply our methodology to several VR systems in the market, and carefully analyze the experiment results. We make several observations: (i) 3D scene complexity affects the timing accuracy the most, (ii) most VR systems implement the dead reckoning algorithm, which incurs a non-trivial correction latency after incorrect predictions, and (iii) there exists an inherent trade-off between two positioning accuracy metrics: precision and sensitivity.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127633958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 38
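Observation (ii) in the abstract refers to dead reckoning, i.e., extrapolating the next pose from the last tracked state. A minimal constant-velocity sketch is shown below; the numbers, frame rate, and error computation are illustrative assumptions, not the paper's measurement procedure.

```python
import numpy as np

def dead_reckoning(position, velocity, dt):
    """Constant-velocity dead reckoning: extrapolate the next pose
    from the last known position and velocity."""
    return position + velocity * dt

# Last tracked state of a headset or controller (illustrative numbers)
pos = np.array([0.10, 1.50, 0.30])    # metres
vel = np.array([0.40, 0.00, -0.20])   # metres per second
dt = 0.011                            # roughly one 90 Hz frame

predicted = dead_reckoning(pos, vel, dt)

# If the user changes direction, the true pose differs from the prediction
# and the system must correct it on a later frame; that correction latency
# is one of the quantities a black-box measurement can try to capture.
true_pos = np.array([0.102, 1.50, 0.299])
print("predicted:", predicted)
print("correction error (m):", np.linalg.norm(true_pos - predicted))
```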
Scene Image Synthesis from Natural Sentences Using Hierarchical Syntactic Analysis
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2967193
Tetsuaki Mano, Hiroaki Yamane, T. Harada
{"title":"Scene Image Synthesis from Natural Sentences Using Hierarchical Syntactic Analysis","authors":"Tetsuaki Mano, Hiroaki Yamane, T. Harada","doi":"10.1145/2964284.2967193","DOIUrl":"https://doi.org/10.1145/2964284.2967193","url":null,"abstract":"Synthesizing a new image from verbal information is a challenging task that has a number of applications. Most research on the issue has attempted to address this question by providing external clues, such as sketches. However, no study has been able to successfully handle various sentences for this purpose without any other information. We propose a system to synthesize scene images solely from sentences. Input sentences are expected to be complete sentences with visualizable objects. Our priorities are the analysis of sentences and the correlation of information between input sentences and visible image patches. A hierarchical syntactic parser is developed for sentence analysis, and a combination of lexical knowledge and corpus statistics is designed for word correlation. The entire system was applied to both a clip-art dataset and an actual image dataset. This application highlighted the capability of the proposed system to generate novel images as well as its ability to succinctly convey ideas.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123950873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
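To give a feel for the kind of syntactic analysis the abstract relies on, the sketch below uses an off-the-shelf dependency parser (spaCy with its small English model, an assumption external to the paper) to pull out visualizable nouns, their modifiers, and spatial prepositions from a sentence. It is a generic illustration, not the paper's hierarchical parser.

```python
import spacy  # assumes the small English model is installed:
              #   python -m spacy download en_core_web_sm

nlp = spacy.load("en_core_web_sm")
doc = nlp("A brown dog sits on a red sofa next to a small table.")

# Visualizable entities: noun chunks with their adjectival modifiers
for chunk in doc.noun_chunks:
    mods = [t.text for t in chunk.root.children if t.dep_ == "amod"]
    print(chunk.root.text, "modifiers:", mods)

# Spatial relations: prepositions linking a verb or noun to another noun
for token in doc:
    if token.dep_ == "prep" and token.head.pos_ in ("NOUN", "VERB"):
        objs = [c.text for c in token.children if c.dep_ == "pobj"]
        print(f"{token.head.text} --{token.text}--> {objs}")
```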
Query Adaptive Instance Search using Object Sketches
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2964317
S. Bhattacharjee, Junsong Yuan, Weixiang Hong, Xiang Ruan
{"title":"Query Adaptive Instance Search using Object Sketches","authors":"S. Bhattacharjee, Junsong Yuan, Weixiang Hong, Xiang Ruan","doi":"10.1145/2964284.2964317","DOIUrl":"https://doi.org/10.1145/2964284.2964317","url":null,"abstract":"Sketch-based object search is a challenging problem mainly due to two difficulties: (1) how to match the binary sketch query with the colorful image, and (2) how to locate the small object in a big image with the sketch query. To address the above challenges, we propose to leverage object proposals for object search and localization. However, instead of purely relying on sketch features, e.g., Sketch-a-Net, to locate the candidate object proposals, we propose to fully utilize the appearance information to resolve the ambiguities among object proposals and refine the search results. Our proposed query adaptive search is formulated as a sub-graph selection problem, which can be solved by maximum flow algorithm. By performing query expansion using a smaller set of more salient matches as the query representatives, it can accurately locate the small target objects in cluttered background or densely drawn deformation intensive cartoon (Manga like) images. Our query adaptive sketch based object search on benchmark datasets exhibits superior performance when compared with existing methods, which validates the advantages of utilizing both the shape and appearance features for sketch-based search.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121052958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
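The abstract formulates search as sub-graph selection solved by maximum flow. The sketch below only shows how such a max-flow instance over a tiny, made-up proposal graph could be solved with networkx; the node names, edge capacities, and graph layout are invented for illustration and do not reproduce the paper's formulation.

```python
import networkx as nx

# Hypothetical graph: a source "query" connects to object proposals, proposals
# connect to a sink "select"; capacities stand in for matching scores.
G = nx.DiGraph()
G.add_edge("query", "prop1", capacity=0.9)
G.add_edge("query", "prop2", capacity=0.4)
G.add_edge("prop1", "prop2", capacity=0.3)   # consistency between proposals
G.add_edge("prop1", "select", capacity=0.8)
G.add_edge("prop2", "select", capacity=0.5)

flow_value, flow_dict = nx.maximum_flow(G, "query", "select")
print("max flow value:", flow_value)
print("flow on each edge:", flow_dict)
```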
Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2964295
Jingkuan Song, Lianli Gao, M. Puscas, F. Nie, Fumin Shen, N. Sebe
{"title":"Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration","authors":"Jingkuan Song, Lianli Gao, M. Puscas, F. Nie, Fumin Shen, N. Sebe","doi":"10.1145/2964284.2964295","DOIUrl":"https://doi.org/10.1145/2964284.2964295","url":null,"abstract":"Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, enabling top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph cutting strategies. However, these two components are often conducted in two separated steps, and thus the obtained similarity graph may not be the optimal one for segmentation and this may lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS)}, which learns the similarity graph and video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors for each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, the new rank constraint is imposed to the Laplacian matrix of the similarity graph, such that the connected components in the resulted similarity graph are exactly equal to the number of segmentations. Furthermore, JGLVS can automatically weigh multiple cues and calibrate the pairwise distance of superpixels based on their topology structures. Most noticeably, empirical results on the challenging dataset VSB100 show that JGLVS achieves promising performance on the benchmark dataset which outperforms the state-of-the-art by up to 11% for the BPR metric.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129192433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
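The rank constraint mentioned in the abstract rests on a standard spectral-graph fact: the multiplicity of the zero eigenvalue of a graph Laplacian equals the number of connected components. The sketch below only verifies that fact on a tiny made-up similarity matrix; it is not the JGLVS optimization.

```python
import numpy as np

def n_connected_components(W, tol=1e-8):
    """Count connected components of a similarity graph from its Laplacian:
    the multiplicity of eigenvalue 0 of L = D - W equals the number of
    connected components."""
    L = np.diag(W.sum(axis=1)) - W
    eigvals = np.linalg.eigvalsh(L)
    return int(np.sum(eigvals < tol))

# Two obvious components: {0, 1} and {2, 3}, with no edges between them
W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(n_connected_components(W))  # -> 2
```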
Improving Speaker Diarization of TV Series using Talking-Face Detection and Clustering
Pub Date: 2016-10-01 | DOI: 10.1145/2964284.2967202
H. Bredin, G. Gelly
{"title":"Improving Speaker Diarization of TV Series using Talking-Face Detection and Clustering","authors":"H. Bredin, G. Gelly","doi":"10.1145/2964284.2967202","DOIUrl":"https://doi.org/10.1145/2964284.2967202","url":null,"abstract":"While successful on broadcast news, meetings or telephone conversation, state-of-the-art speaker diarization techniques tend to perform poorly on TV series or movies. In this paper, we propose to rely on state-of-the-art face clustering techniques to guide acoustic speaker diarization. Two approaches are tested and evaluated on the first season of Game Of Thrones TV series. The second (better) approach relies on a novel talking-face detection module based on bi-directional long short-term memory recurrent neural network. Both audio-visual approaches outperform the audio-only baseline. A detailed study of the behavior of these approaches is also provided and paves the way to future improvements.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128851409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 33
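The talking-face detector described above is built on a bidirectional LSTM. A minimal PyTorch sketch of such a per-frame sequence classifier is shown below; the feature dimension (2D facial landmarks), hidden size, and the absence of any training loop are all illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class TalkingFaceBiLSTM(nn.Module):
    """Minimal bidirectional LSTM that scores each frame of a face track
    as talking / not talking (feature dimensions are illustrative)."""
    def __init__(self, feat_dim=68 * 2, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)   # per-frame talking score

    def forward(self, x):                      # x: (batch, frames, feat_dim)
        h, _ = self.lstm(x)
        return torch.sigmoid(self.head(h)).squeeze(-1)  # (batch, frames)

# Toy forward pass on random facial-landmark features
model = TalkingFaceBiLSTM()
tracks = torch.randn(4, 25, 68 * 2)            # 4 face tracks, 25 frames each
print(model(tracks).shape)                     # torch.Size([4, 25])
```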