Proceedings of the 20th ACM international conference on Multimedia: latest papers, page 9

Geometric context-preserving progressive transmission in mobile visual search

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2396355

J. Xia, Ke Gao, Dongming Zhang, Zhendong Mao

Abstract: Progressive transmission is an effective way to reduce retrieval latency in mobile visual search. However, the speedup achieved by existing progressive transmission strategies is often limited because they neglect geometric information in the query image. This paper proposes an effective and efficient geometric context-preserving progressive transmission method suited to mobile visual search. A query image is divided into blocks, and the local features in the same block, rather than single features, are used as query units. Since clustered features with geometric information are more discriminative, only a few of them are needed to support correct matching with high precision. Our method therefore significantly decreases the number of features that must be transmitted and reduces retrieval latency. Experiments on the Stanford dataset for mobile visual search show that, with comparable precision, we use 43% less retrieval time than the existing progressive transmission method. Moreover, we establish and release a large-scale image dataset called MVSBench, which is more difficult and better suited to mobile visual search. It contains 75500 images and covers variations such as view change, blur, scale, illumination, and rotation. MVSBench is another major contribution of this paper, and our method also outperforms other strategies on this dataset.
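The block-based query units described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the grid size, the `(x, y, descriptor)` keypoint layout, and the densest-block-first transmission order are all assumptions made here for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def group_features_by_block(keypoints, image_size, grid=(4, 4)):
    """Group local features into spatial blocks of the query image.

    keypoints:  list of (x, y, descriptor) tuples (assumed layout)
    image_size: (width, height) in pixels
    grid:       (columns, rows) of the block grid (assumed value)
    Returns a dict mapping (col, row) -> list of descriptors.
    """
    width, height = image_size
    cols, rows = grid
    blocks = defaultdict(list)
    for x, y, desc in keypoints:
        # Clamp so points on the right or bottom edge fall in the last block.
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        blocks[(col, row)].append(desc)
    return dict(blocks)


def transmission_order(blocks):
    """Order blocks for sending, densest first.

    This ordering is an assumed heuristic, not the paper's strategy.
    Ties are broken by block coordinates so the order is deterministic.
    """
    return sorted(blocks, key=lambda b: (-len(blocks[b]), b))
```

Each block's descriptors would then be sent together as one query unit, so the server can match a spatially coherent group of features instead of isolated ones.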

Citations: 11

Multi-view learning from imperfect tagging

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2393416

Zhongang Qi, Ming Yang, Zhongfei Zhang, Zhengyou Zhang

Abstract: In many real-world applications, tagging is imperfect: incomplete, inconsistent, and error-prone. Solutions to this problem will have societal and technical impact. In this paper, we investigate this arguably new problem: learning from imperfect tagging. We propose a general and effective learning scheme for this problem called Multi-view Imperfect Tagging Learning (MITL). The main idea of MITL is to extract information about the imperfectly tagged training dataset from multiple views in order to differentiate the data points in their role in classification. Further, we propose a novel discriminative classification method under the MITL framework that explicitly uses the given multiple labels simultaneously as an additional feature. This delivers more effective classification than the existing literature, where one label at a time is treated as the classification target and the remaining labels are ignored. The proposed methods can not only complete incomplete tagging but also denoise noisy tagging through inductive learning. We apply the general solution to a more specific context, imperfect image annotation, and evaluate the proposed methods on a standard dataset from the related literature. Experiments show that they are superior to peer methods at learning from imperfect tagging in cross-media.

Citations: 8

Improving dense image correspondence estimation with interactive user guidance

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2396400

K. Ruhl, Benjamin Hell, F. Klose, C. Lipski, Sören Petersen, M. Magnor

Citations: 5

MoViMash: online mobile video mashup

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2393373

M. Saini, Raghudeep Gadde, Shuicheng Yan, Wei Tsang Ooi

Abstract: With the proliferation of mobile video cameras, it is becoming easier for users to capture videos of live performances and share them socially with friends and the public. Because an attendee of such a live performance typically has limited mobility, each video camera can capture only from a restricted range of viewing angles and distances, producing a rather monotonous video clip. At such performances, however, multiple video clips can be captured by different users, likely from different angles and distances. These videos can be combined to produce a more interesting and representative mashup of the live performance for broadcasting and sharing. Earlier works select video shots based merely on the quality of the currently available videos. In a real video editing process, however, recent selection history plays an important role in choosing future shots. In this work, we present MoViMash, a framework for automatic online video mashup that makes smooth shot transitions to cover the performance from diverse perspectives. Shot transition and shot length distributions are learned from professionally edited videos. Further, we introduce view quality assessment in the framework to filter out shaky, occluded, and tilted videos. To the best of our knowledge, this is the first attempt to incorporate history-based diversity measurement, state-based video editing rules, and view quality in automated video mashup generation. Experimental results demonstrate the effectiveness of the MoViMash framework.
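The idea of learning shot transition distributions from edited videos can be sketched as a simple frequency estimate. This is only an illustration under stated assumptions: the abstract does not define the shot states, so the labels used here ("wide", "close", "medium") and the function name `learn_transition_probs` are hypothetical, and the paper's actual model may differ.

```python
from collections import Counter, defaultdict


def learn_transition_probs(shot_sequences):
    """Estimate P(next shot type | current shot type) by counting.

    shot_sequences: list of sequences of shot labels, one sequence per
    edited video (the label vocabulary is an assumption of this sketch).
    Returns a nested dict: probs[current][next] -> probability.
    """
    counts = defaultdict(Counter)
    for seq in shot_sequences:
        # Count each consecutive (current, next) pair in the edit.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[current] = {nxt: n / total for nxt, n in next_counts.items()}
    return probs
```

An online selector could then weight candidate shots by these probabilities given the most recently selected shot, which is one way to make recent selection history influence the next choice.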

Citations: 72

Mobile-based advertisement information retrieval from images and websites

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2396347

Yi-Feng Pan, Jian Sun, Siyuan Chen, Yuan He, Yingju Xia, Jun Sun, S. Naoi

Abstract: In the real world, a huge number of advertisement (ad) boards give customers easy visual awareness of products and services. However, the information on an ad board is so limited that customers often want a convenient way to learn more details. In this paper, we present a mobile-based prototype system that automatically extracts web ad information from images and websites. After ad images are captured by smartphone and sent to a remote server, the ad image text is recognized by an OCR engine, and ad phrases and keywords are extracted from it and combined into queries. Candidate ad web pages are then obtained from specific search engines and clustered to remove noise. The OCR results are further used to estimate valid ad topic web pages, which are pushed back to end users so they can look up more detailed ad information. In experiments on a real-world ad image dataset that we collected, true ad topic web pages are found among the top-one and top-ten returned pages for about 51.85% and 83.33% of query images respectively, which illustrates the effectiveness of the proposed system.

Citations: 0

ACM international workshop on cloud-based multimedia applications and services for e-health (CBMAS-EH 2012)

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2396544

M. S. Hossain, Abdulmotaleb El Saddik

Citations: 0

Personalized access to cultural heritage: multimedia by the crowd, for the crowd

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2396547

J. Oomen, Lora Aroyo, S. Marchand-Maillet, J. Douglass

Citations: 1

Parallel deblocking filtering in H.264/AVC using multiple CPUs and GPUs

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2396370

Bart Pieters, Charles-Frederik Hollemeersch, J. D. Cock, W. D. Neve, P. Lambert, R. Walle

Citations: 1

Analyzing social media via event facets

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/2393347.2396484

Zhiyu Wang, Peng Cui, Lexing Xie, Hao Chen, Wenwu Zhu, Shiqiang Yang

Citations: 17

Session details: Full paper session 4: large scale search

Proceedings of the 20th ACM international conference on Multimedia Pub Date : 2012-10-29 DOI: 10.1145/3246397

C. Ngo

Citations: 0