Proceedings of the 19th ACM international conference on Multimedia: Latest Articles

Colorizing tags in tag cloud: a novel query-by-tag music search system
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072337
Ju-Chiang Wang, Yu-Chin Shih, Meng-Sung Wu, H. Wang, Shyh-Kang Jeng
{"title":"Colorizing tags in tag cloud: a novel query-by-tag music search system","authors":"Ju-Chiang Wang, Yu-Chin Shih, Meng-Sung Wu, H. Wang, Shyh-Kang Jeng","doi":"10.1145/2072298.2072337","DOIUrl":"https://doi.org/10.1145/2072298.2072337","url":null,"abstract":"This paper presents a novel content-based query-by-tag music search system for an untagged music database. We design a new tag query interface that allows users to input multiple tags with multiple levels of preference (denoted as an MTML query) by colorizing desired tags in a web-based tag cloud interface. When a user clicks and holds the left mouse button (or presses and holds his/her finger on a touch screen) on a desired tag, the color of the tag will change cyclically according to a color map (from dark blue to bright red), which represents the level of preference (from 0 to 1). In this way, the user can easily organize and check the query of multiple tags with multiple levels of preference through the colored tags. To effect the MTML content-based music retrieval, we introduce a probabilistic fusion model (denoted as GMFM), which consists of two mixture models, namely a Gaussian mixture model and a multinomial mixture model. GMFM can jointly model the auditory features and tag labels of a song. Two indexing methods and their corresponding matching methods, namely pseudo song-based matching and tag affinity-based matching, are incorporated into the pre-learned GMFM. We evaluate the proposed system on the MajorMiner and CAL-500 datasets. The experimental results demonstrate the effectiveness of GMFM and the potential of using MTML queries to search music from an untagged music database.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"208 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131546034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
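The color-coded preference mechanism described above maps a level between 0 and 1 onto a ramp from dark blue to bright red. As a rough illustration only, not the authors' implementation, the Python sketch below assumes a simple linear interpolation between those two endpoint colors and represents an MTML query as a tag-to-preference dictionary; the tag names are invented.

```python
# Hypothetical sketch: map a tag's preference level in [0, 1] to a color on a
# dark-blue -> bright-red ramp and collect an MTML query as {tag: preference}.
# The paper only specifies the ramp's endpoints, so linear interpolation is assumed.

def preference_to_rgb(level: float) -> tuple:
    """Interpolate linearly from dark blue (level 0.0) to bright red (level 1.0)."""
    level = max(0.0, min(1.0, level))
    dark_blue, bright_red = (0, 0, 139), (255, 0, 0)
    return tuple(round(b + (r - b) * level) for b, r in zip(dark_blue, bright_red))

# An MTML (multiple tags, multiple levels of preference) query built from colorized tags.
mtml_query = {"jazz": 0.9, "piano": 0.6, "mellow": 0.3}
tag_colors = {tag: preference_to_rgb(level) for tag, level in mtml_query.items()}
print(tag_colors)  # {'jazz': (230, 0, 14), 'piano': (153, 0, 56), 'mellow': (76, 0, 97)}
```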
Bilinear deep learning for image classification
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072505
S. Zhong, Y. Liu, Yang Liu
{"title":"Bilinear deep learning for image classification","authors":"S. Zhong, Y. Liu, Yang Liu","doi":"10.1145/2072298.2072505","DOIUrl":"https://doi.org/10.1145/2072298.2072505","url":null,"abstract":"","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131704598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Random partial paired comparison for subjective video quality assessment via HodgeRank
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072350
Qianqian Xu, Tingting Jiang, Y. Yao, Qingming Huang, Bowei Yan, Weisi Lin
{"title":"Random partial paired comparison for subjective video quality assessment via hodgerank","authors":"Qianqian Xu, Tingting Jiang, Y. Yao, Qingming Huang, Bowei Yan, Weisi Lin","doi":"10.1145/2072298.2072350","DOIUrl":"https://doi.org/10.1145/2072298.2072350","url":null,"abstract":"Subjective visual quality evaluation provides the groundtruth and source of inspiration in building objective visual quality metrics. Paired comparison is expected to yield more reliable results; however, this is an expensive and timeconsuming process. In this paper, we propose a novel framework of HodgeRank on Random Graphs (HRRG) to achieve efficient and reliable subjective Video Quality Assessment (VQA). To address the challenge of a potentially large number of combinations of videos to be assessed, the proposed methodology does not require the participants to perform the complete comparison of all the paired videos. Instead, participants only need to perform a random sample of all possible paired comparisons, which saves a great amount of time and labor. In contrast to the traditional deterministic incomplete block designs, our random design is not only suitable for traditional laboratory and focus-group studies, but also fit for crowdsourcing experiments on Internet where the raters are distributive over Internet and it is hard to control with precise experimental designs. Our contribution in this work is three-fold: 1) a HRRG framework is proposed to quantify the quality of video; 2) a new random design principle is investigated to conduct paired comparison based on Erdos-Renyi random graph theory; 3) Hodge decomposition is introduced to derive, from incomplete and imbalanced data, quality scores of videos and inconsistency of participants'judgments. We demonstrate the effectiveness of the proposed framework on LIVE Database. Equipped with random graph theory and HodgeRank, our scheme has the following advantages over the traditional ones: 1) data collection is simple and easy to handle, and thus is more suitable for crowdsourcing on Internet; 2) workload on participants is lower and more flexible; 3) the rating procedure is efficient, labor-saving, and more importantly, without jeopardizing the accuracy of the results.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130288224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 44
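At its core, the gradient (global score) part of HodgeRank reduces to a least-squares problem over the observed pairwise differences, with the pairs drawn by an Erdős-Rényi random design. The sketch below illustrates only that part with NumPy; the curl and harmonic components the paper uses to quantify judgment inconsistency are omitted, and the number of videos, the sampling probability, and the noise model are invented for the demo.

```python
# Minimal sketch of HodgeRank's least-squares (gradient) component on an
# Erdos-Renyi random design. The curl/harmonic inconsistency analysis from the
# paper is omitted; true scores and noise level are made up for the demo.
import numpy as np

rng = np.random.default_rng(0)
n_videos, p_edge = 8, 0.4                    # G(n, p) random design
true_scores = rng.normal(size=n_videos)

rows, diffs = [], []
for i in range(n_videos):
    for j in range(i + 1, n_videos):
        if rng.random() < p_edge:            # keep this pair with probability p
            row = np.zeros(n_videos)
            row[i], row[j] = -1.0, 1.0
            rows.append(row)
            # observed paired-comparison outcome: noisy score difference
            diffs.append(true_scores[j] - true_scores[i] + rng.normal(scale=0.2))

A, y = np.vstack(rows), np.array(diffs)
scores, *_ = np.linalg.lstsq(A, y, rcond=None)
scores -= scores.mean()                      # scores are defined only up to a constant
print(np.argsort(-scores))                   # estimated quality ranking
```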
Tennis real play: an interactive tennis game with models from real videos
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072361
Jui-Hsin Lai, Chieh-Li Chen, Po-Chen Wu, Chieh-Chi Kao, Shao-Yi Chien
{"title":"Tennis real play: an interactive tennis game with models from real videos","authors":"Jui-Hsin Lai, Chieh-Li Chen, Po-Chen Wu, Chieh-Chi Kao, Shao-Yi Chien","doi":"10.1145/2072298.2072361","DOIUrl":"https://doi.org/10.1145/2072298.2072361","url":null,"abstract":"Tennis Real Play (TRP) is an interactive tennis game system constructed with models extracted from videos of real matches. The key techniques proposed for TRP include player modeling and video-based player/court rendering. For player model creation, we propose a database normalization process and a behavioral transition model of tennis players, which might be a good alternative for motion capture in the conventional video games. For player/court rendering, we propose a framework for rendering vivid game characters and providing the real-time ability. We can say that image-based rendering leads to a more interactive and realistic rendering. Experiments show that video games with vivid viewing effects and characteristic players can be generated from match videos without much user intervention. Because the player model can adequately record the ability and condition of a player in the real world, it can then be used to roughly predict the results of real tennis matches in the next days. The results of a user study reveal that subjects like the increased interaction, immersive experience, and enjoyment from playing TRP.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129007889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
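The behavioral transition model mentioned in the abstract implies that a player's next action depends probabilistically on the current one. The sketch below treats this as a first-order Markov chain; the states and transition probabilities are invented for illustration, whereas the paper estimates such a model from real match videos.

```python
# Hypothetical first-order Markov chain over tennis player behaviors.
# States and probabilities are illustrative only, not taken from the paper.
import random

transitions = {
    "wait":       {"move_left": 0.25, "move_right": 0.25, "swing": 0.4, "wait": 0.1},
    "move_left":  {"swing": 0.6, "wait": 0.4},
    "move_right": {"swing": 0.6, "wait": 0.4},
    "swing":      {"wait": 0.7, "move_left": 0.15, "move_right": 0.15},
}

def simulate(start: str, steps: int, seed: int = 0) -> list:
    """Sample a behavior sequence from the transition model."""
    rng = random.Random(seed)
    state, seq = start, [start]
    for _ in range(steps):
        nxt, probs = zip(*transitions[state].items())
        state = rng.choices(nxt, weights=probs)[0]
        seq.append(state)
    return seq

print(simulate("wait", 10))
```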
SRV-TaGS: An Automatic TAGging and Search System for Sensor-Rich Outdoor Videos
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072444
Zhijie Shen, Sakire Arslan Ay, S. H. Kim
{"title":"SRV-TaGS: An Automatic TAGging and Search System for Sensor-Rich Outdoor Videos","authors":"Zhijie Shen, Sakire Arslan Ay, S. H. Kim","doi":"10.1145/2072298.2072444","DOIUrl":"https://doi.org/10.1145/2072298.2072444","url":null,"abstract":"Tagging facilitates video search in many social media and web applications. While manual tagging is time consuming, subjective and sometimes inaccurate, auto-tagging facilitated by content-based techniques is compute-intensive and challenging to apply across domains. We have developed a complementary system, named SRV-TAGS, to automatically generate tags for outdoor videos based on their geographic properties, to index the videos based on their generated tags and to provide textual search services. The system works with our geo-referenced video management web portal, enabling users to manage, search and watch videos.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"222 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122405372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
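SRV-TAGS derives tags from a video's geographic properties rather than its pixel content. A plausible simplification of that idea, assuming a flat-earth approximation over short distances and invented landmark coordinates, field of view, and range, is to tag a frame with every known landmark that falls inside the camera's viewable scene given its GPS position and compass bearing.

```python
# Sketch: tag a video frame with landmarks that fall inside the camera's
# viewable scene, given GPS position, compass bearing and field of view.
# Landmark coordinates, the 60-degree FOV and the 500 m range are assumptions;
# a flat-earth approximation is used, acceptable over a few hundred meters.
import math

def bearing_and_distance(cam_lat, cam_lon, lat, lon):
    """Approximate bearing (degrees clockwise from north) and distance (meters)."""
    dy = (lat - cam_lat) * 111_320.0
    dx = (lon - cam_lon) * 111_320.0 * math.cos(math.radians(cam_lat))
    return math.degrees(math.atan2(dx, dy)) % 360.0, math.hypot(dx, dy)

def visible_tags(cam_lat, cam_lon, heading_deg, landmarks,
                 fov_deg=60.0, max_range_m=500.0):
    tags = []
    for name, (lat, lon) in landmarks.items():
        bearing, dist = bearing_and_distance(cam_lat, cam_lon, lat, lon)
        off_axis = abs((bearing - heading_deg + 180.0) % 360.0 - 180.0)
        if off_axis <= fov_deg / 2 and dist <= max_range_m:
            tags.append(name)
    return tags

landmarks = {"clock_tower": (34.0224, -118.2860), "fountain": (34.0208, -118.2868)}
print(visible_tags(34.0211, -118.2855, 300.0, landmarks))
```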
Contextual synonym dictionary for visual object retrieval
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072364
Wenbin Tang, Rui Cai, Zhiwei Li, Lei Zhang
{"title":"Contextual synonym dictionary for visual object retrieval","authors":"Wenbin Tang, Rui Cai, Zhiwei Li, Lei Zhang","doi":"10.1145/2072298.2072364","DOIUrl":"https://doi.org/10.1145/2072298.2072364","url":null,"abstract":"In this paper, we study the problem of visual object retrieval by introducing a dictionary of contextual synonyms to narrow down the semantic gap in visual word quantization. The basic idea is to expand a visual word in the query image with its synonyms to boost the retrieval recall. Unlike the existing work such as soft-quantization, which only focuses on the Euclidean (l2) distance in descriptor space, we utilize the visual words which are more likely to describe visual objects with the same semantic meaning by identifying the words with similar contextual distributions (i.e. contextual synonyms). We describe the contextual distribution of a visual word using the statistics of both co-occurrence and spatial information averaged over all the image patches having this visual word, and propose an efficient system implementation to construct the contextual synonym dictionary for a large visual vocabulary. The whole construction process is unsupervised and the synonym dictionary can be naturally integrated into a standard bag-of-feature image retrieval system. Experimental results on several benchmark datasets are quite promising. The contextual synonym dictionary-based expansion consistently outperforms the l2 distance-based soft-quantization, and advances the state-of-the-art performance remarkably.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127709452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
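The notion of a contextual synonym can be approximated by giving each visual word a context vector of co-occurrence counts and treating the nearest vectors as synonyms. The NumPy sketch below uses plain cosine similarity over a random toy co-occurrence matrix; the paper additionally folds in spatial statistics and scales the construction to very large vocabularies, neither of which is reproduced here.

```python
# Toy sketch of a contextual synonym dictionary: visual words whose
# co-occurrence distributions are most similar are treated as synonyms.
# The co-occurrence matrix here is random stand-in data.
import numpy as np

rng = np.random.default_rng(1)
vocab_size, top_k = 20, 3
# cooc[i, j] ~ how often word j appears in the context of word i
cooc = rng.poisson(2.0, size=(vocab_size, vocab_size)).astype(float)

norms = np.linalg.norm(cooc, axis=1, keepdims=True)
sim = (cooc / norms) @ (cooc / norms).T          # cosine similarity between context vectors
np.fill_diagonal(sim, -np.inf)                   # a word is not its own synonym

synonyms = {w: list(np.argsort(-sim[w])[:top_k]) for w in range(vocab_size)}

# Query expansion: replace each query word by itself plus its contextual synonyms.
query_words = [3, 7]
expanded = {w for q in query_words for w in [q, *synonyms[q]]}
print(expanded)
```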
The FCam API for programmable cameras
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072425
S. H. Park, Andrew Adams, Eino-Ville Talvala
{"title":"The FCam API for programmable cameras","authors":"S. H. Park, Andrew Adams, Eino-Ville Talvala","doi":"10.1145/2072298.2072425","DOIUrl":"https://doi.org/10.1145/2072298.2072425","url":null,"abstract":"The FCam API is an open-source camera control library, enabling precise control over a camera's imaging pipeline. Intended for researchers and students in the field of computational photography, it allows easy implementation of novel algorithms and applications. Currently implemented on the Nokia N900 smartphone, and a custom-built \"Frankencamera\", it has been used in teaching at universities around the world, and is freely available for download for the N900. This paper describes the architecture underlying the API, the design of the API itself, several applications built on top of it, and some examples of its use in education.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"23 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121423730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Extracting intentionally captured regions using point trajectories
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072029
Yuta Nakashima, N. Babaguchi
{"title":"Extracting intentionally captured regions using point trajectories","authors":"Yuta Nakashima, N. Babaguchi","doi":"10.1145/2072298.2072029","DOIUrl":"https://doi.org/10.1145/2072298.2072029","url":null,"abstract":"When camera persons take videos with mobile video cameras, they usually have capture intentions, i.e., what they want to express in their videos, and there are intentionally captured regions (ICRs) in the video frames that are essential for the capture intentions. Extracting ICRs is thus beneficial for wide range of applications such as video summarization and video adaptation for small displays. In this paper, we present a novel method for automatically extracting ICRs. A camera person usually moves his/her camera so that ICRs can be arranged in appropriate positions in video frames; therefore, ICRs can yield specific motion. This observation indicates that such specific motion is a vital cue for extracting ICRs. The proposed method represents motion by point trajectories, which are long-term trajectories of spatially dense points in video frames, and extracts ICRs using an ICR model based on the point trajectories. We experimentally evaluate the proposed method to demonstrate its potential applicability.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129389797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
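The method builds on long-term trajectories of densely sampled points. The sketch below only shows how such trajectories can be gathered with OpenCV's KLT tracker over a short clip (the video path is a placeholder); the ICR model that decides which trajectories belong to an intentionally captured region is the paper's contribution and is not reproduced.

```python
# Sketch: build point trajectories over a short clip with the KLT tracker.
# Only trajectory extraction is shown; classifying trajectories into
# intentionally captured regions (the ICR model) is not implemented here.
import cv2

cap = cv2.VideoCapture("input.mp4")              # placeholder path
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                             qualityLevel=0.01, minDistance=7)
trajectories = [[pt.ravel()] for pt in p0]       # one list of (x, y) points per feature

for _ in range(30):                              # track over the next 30 frames
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)
    for traj, pt, st in zip(trajectories, p1, status):
        if st[0] == 1:                           # point successfully tracked
            traj.append(pt.ravel())
    prev_gray, p0 = gray, p1

print(f"{len(trajectories)} trajectories, longest has {max(map(len, trajectories))} points")
```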
News contextualization with geographic and visual information
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072317
Zechao Li, M. Wang, J. Liu, Changsheng Xu, Hanqing Lu
{"title":"News contextualization with geographic and visual information","authors":"Zechao Li, M. Wang, J. Liu, Changsheng Xu, Hanqing Lu","doi":"10.1145/2072298.2072317","DOIUrl":"https://doi.org/10.1145/2072298.2072317","url":null,"abstract":"In this paper, we investigate the contextualization of news documents with geographic and visual information. We propose a matrix factorization approach to analyze the location relevance for each news document. We also propose a method to enrich the document with a set of web images. For location relevance analysis, we first perform toponym extraction and expansion to obtain a toponym list from news documents. We then propose a matrix factorization method to estimate the location-document relevance scores while simultaneously capturing the correlation of locations and documents. For image enrichment, we propose a method to generate multiple queries from each news document for image search and then employ an intelligent fusion approach to collect a set of images from the search results. Based on the location relevance analysis and image enrichment, we introduce a news browsing system named NewsMap which can support users in reading news via browsing a map and retrieving news with location queries. The news documents with the corresponding enriched images are presented to help users quickly get information. Extensive experiments demonstrate the effectiveness of our approaches.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117080441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
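The location-relevance step fills in a partially observed location-document score matrix by low-rank factorization. The sketch below fits a generic factorization R ≈ U Vᵀ by gradient descent on the observed entries only; the paper's actual objective, including how it captures correlations among locations and among documents, is not reproduced, and the toy matrix is random.

```python
# Generic low-rank factorization R ~ U @ V.T fitted on observed entries only.
# This is an illustrative stand-in, not the paper's exact objective; the toy
# relevance matrix and all hyper-parameters are made up.
import numpy as np

rng = np.random.default_rng(2)
n_locations, n_docs, rank = 6, 10, 3
R = rng.random((n_locations, n_docs))
observed = rng.random((n_locations, n_docs)) < 0.5   # mask of known relevance scores

U = rng.normal(scale=0.1, size=(n_locations, rank))
V = rng.normal(scale=0.1, size=(n_docs, rank))
lr, reg = 0.05, 0.01

for _ in range(500):
    err = (U @ V.T - R) * observed                   # error on observed entries only
    U -= lr * (err @ V + reg * U)
    V -= lr * (err.T @ U + reg * V)

relevance = U @ V.T                                  # completed relevance scores
print(np.abs((relevance - R) * observed).mean())     # fit on observed entries
```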
StoryImaging: a media-rich presentation system for textual stories
Proceedings of the 19th ACM international conference on Multimedia. Pub Date: 2011-11-28. DOI: 10.1145/2072298.2072451
Genliang Guan, Zhiyong Wang, Xiansheng Hua, D. Feng
{"title":"StoryImaging: a media-rich presentation system for textual stories","authors":"Genliang Guan, Zhiyong Wang, Xiansheng Hua, D. Feng","doi":"10.1145/2072298.2072451","DOIUrl":"https://doi.org/10.1145/2072298.2072451","url":null,"abstract":"In this demo, we develop the StoryImaging system to illustrate a textual story with both images harvested from the Web and synthesized speech. At the backend, a story is firstly processed to identify key terms such as named entities and to obtain the story summary. With the aid of commercial search engines, images are then collected from the Web for those key terms and re-ranked by taking the summary as context. At last, images are clustered to provide an overview of the story. At the web-based frontend, the user interface has been tailored to both improve information comprehension and provide engaging and explorative experiences for users by closely bridging textual and visual modalities.","PeriodicalId":318758,"journal":{"name":"Proceedings of the 19th ACM international conference on Multimedia","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117340968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
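One step in the pipeline re-ranks web image search results using the story summary as context. One plausible way to illustrate this, with scikit-learn and invented snippet text, is to score each candidate image's surrounding text against the summary by TF-IDF cosine similarity; the paper's multi-query fusion strategy is more involved than this.

```python
# Sketch: re-rank candidate images by TF-IDF cosine similarity between their
# surrounding text and the story summary. Snippets and summary are invented;
# the paper's multi-query fusion strategy is not reproduced here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

summary = "The explorer crossed the frozen lake and reached the old lighthouse."
image_snippets = {
    "img_001.jpg": "lighthouse on a rocky coast at dusk",
    "img_002.jpg": "city skyline with modern skyscrapers",
    "img_003.jpg": "hiker crossing a frozen lake in winter",
}

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([summary, *image_snippets.values()])
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()

ranked = sorted(zip(image_snippets, scores), key=lambda x: -x[1])
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```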