Proceedings of the 20th ACM international conference on Multimedia: latest publications

Session details: Full paper session 15: image content analysis
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/3246408
Winston H. Hsu
{"title":"Session details: Full paper session 15: image content analysis","authors":"Winston H. Hsu","doi":"10.1145/3246408","DOIUrl":"https://doi.org/10.1145/3246408","url":null,"abstract":"","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117137262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Modalities consensus for multi-modal constraint propagation
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396309
Zhenyong Fu, Hongtao Lu, H. Ip, Zhiwu Lu
{"title":"Modalities consensus for multi-modal constraint propagation","authors":"Zhenyong Fu, Hongtao Lu, H. Ip, Zhiwu Lu","doi":"10.1145/2393347.2396309","DOIUrl":"https://doi.org/10.1145/2393347.2396309","url":null,"abstract":"This paper presents a novel modalities consensus framework for multi-modal pairwise constraint propagation (MCP). We first combine multiple single-modal constraint propagation (SCP) problems together, and then explicitly introduce a new modalities consensus regularizer to force the propagation results on different modalities to be consistent with each other. With a separable consensus regularizer, the proposed approach can be effectively solved using an alternating optimization way. More importantly, based on our modalities consensus framework, two single-modal constraint propagation algorithms can be directly reformulated as two well-defined multi-modal solutions. Experimental results on constrained clustering tasks have shown that the proposed framework can achieve significant improvements with respect to the state of the arts.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128486765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
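The abstract above does not give the exact objective, so the following is a minimal NumPy sketch of the general idea only: per-modality affinity matrices, an initial pairwise-constraint matrix Z (+1 must-link, -1 cannot-link), a label-propagation-style update per modality, and a simple quadratic pull toward the cross-modality average. The variable names, the update rule, and the consensus term are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def propagate_with_consensus(W_list, Z, alpha=0.9, lam=1.0, iters=50):
    """Toy alternating scheme: each modality propagates constraints over its own
    graph (label-propagation style) while a quadratic term pulls every
    per-modality result toward the cross-modality average."""
    S_list = []
    for W in W_list:
        d = W.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
        # Symmetrically normalized graph operator S = D^{-1/2} W D^{-1/2}.
        S_list.append(W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :])

    F_list = [Z.copy() for _ in W_list]
    for _ in range(iters):
        F_bar = np.mean(F_list, axis=0)                  # current consensus
        for m, S in enumerate(S_list):
            # One propagation step on modality m, plus a pull toward consensus.
            prop = alpha * S @ F_list[m] @ S.T + (1 - alpha) * Z
            F_list[m] = (prop + lam * F_bar) / (1.0 + lam)
    return np.mean(F_list, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20
    W1 = np.abs(rng.standard_normal((n, n)))
    W1 = (W1 + W1.T) / 2
    W2 = np.abs(rng.standard_normal((n, n)))
    W2 = (W2 + W2.T) / 2
    Z = np.zeros((n, n))
    Z[0, 1] = Z[1, 0] = 1.0      # must-link
    Z[0, 5] = Z[5, 0] = -1.0     # cannot-link
    print(propagate_with_consensus([W1, W2], Z).shape)   # (20, 20)
```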
A tool for automatic cinemagraphs
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396431
Mei-Chen Yeh, Po-Yi Li
{"title":"A tool for automatic cinemagraphs","authors":"Mei-Chen Yeh, Po-Yi Li","doi":"10.1145/2393347.2396431","DOIUrl":"https://doi.org/10.1145/2393347.2396431","url":null,"abstract":"A cinemagraph is a new type of medium that infuses a static image with the dynamics of one or a few particular regions. It is in many ways intermediate between a photograph and a video, and provides a simple, yet expressive way to mix static and dynamic elements from a video clip. However, the process of creating cinemagraphs is usually tedious for end users and requires serious photo editing skills. In this demonstration we show a tool that creates cinemagraphs in a fully automatic manner. The technique should enable new features such as \"intelligent cinemagraph mode\" for digital cameras that provides an alternative method to capture the moment.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130028102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
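As a rough illustration of what a cinemagraph is, the sketch below composites video frames with a frozen reference frame using a region mask. The paper's contribution is selecting that region automatically, which the sketch does not attempt; the mask and data here are synthetic placeholders.

```python
import numpy as np

def composite_cinemagraph(frames, still, mask):
    """Compose cinemagraph frames: inside `mask` the video keeps moving,
    everywhere else the static `still` frame is shown.

    frames: (T, H, W, 3) float array of video frames in [0, 1]
    still:  (H, W, 3) reference frame to freeze
    mask:   (H, W) float array in [0, 1]; 1 = animated region
    """
    m = mask[None, ..., None]                  # broadcast over time and channels
    return m * frames + (1.0 - m) * still[None, ...]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.random((8, 64, 64, 3))        # fake video clip
    still = frames[0]                          # frame to freeze
    mask = np.zeros((64, 64))
    mask[16:48, 16:48] = 1.0                   # keep only this region animated
    out = composite_cinemagraph(frames, still, mask)
    print(out.shape)                           # (8, 64, 64, 3)
```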
A multimedia analytics framework for browsing image collections in digital forensics
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2393392
M. Worring, Andreas Engl, Camelia Smeria
{"title":"A multimedia analytics framework for browsing image collections in digital forensics","authors":"M. Worring, Andreas Engl, Camelia Smeria","doi":"10.1145/2393347.2393392","DOIUrl":"https://doi.org/10.1145/2393347.2393392","url":null,"abstract":"Searching through large collections of images to find patterns of use or to find sets of relevant items is difficult, especially when the information to consider is not only the content of the images itself, but also the associated metadata. Multimedia analytics is a new approach to such problems. We consider the case of forensic experts facing image collections of growing size during digital forensic investigations. We answer the forensic challenge by developing specialised novel interactive visualisations which employ content-based image clusters in both the analysis as well as in all visualizations. Their synergy makes the task of manually browsing these collections more effective and efficient. Evaluation of such multimedia analytics is a notoriously hard problem as there are so many factors influencing the result. As a controlled evaluation, we developed a user simulation framework to create image collections with time and directory information as metadata. We apply it in a number of scenarios to illustrate its use. The simulation tool is available to other researchers via our website.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128929039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Search web images using objects, backgrounds and conditions
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396350
Jiemi Zhang, Chenxia Wu, Deng Cai
{"title":"Search web images using objects, backgrounds and conditions","authors":"Jiemi Zhang, Chenxia Wu, Deng Cai","doi":"10.1145/2393347.2396350","DOIUrl":"https://doi.org/10.1145/2393347.2396350","url":null,"abstract":"As the volumes of web images have grown rapidly in the last decade, Content-Based Image Retrieval (CBIR) has attracted substantial interests as an effective tool to manage the images. Most existing CBIR systems focus on the object in the image, while ignoring the conditions (day/night, sunny/rain, etc) and the backgrounds, both of which are very helpful to meet the user's information need. To overcome this shortcoming, in this paper, we present a novel CBIR system depending on a novel query formulation considering three aspects: Object, Background and Condition. Specifically, we design a user-friendly interface to help the user formulate a query. The interface can allow the user to give the percentage, relative position and size of each object in the background. Moreover, a corresponding effective ranking method is proposed to return the desirable search results. Experimental results demonstrate that our proposed system improves the searching performance and the user experience compared with the existing searching systems.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126955194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
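The abstract above does not specify the ranking function, so the sketch below is a toy interpretation: it scores candidate images against a structured object/background/condition query, penalizing deviation from the requested object size and relative position. All field names, weights, and the scoring rule are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class ObjectSpec:
    name: str
    area_pct: float       # desired fraction of the image area
    center: tuple         # desired relative (x, y) position in [0, 1]

@dataclass
class Query:
    objects: list
    background: str
    condition: str        # e.g. "night", "sunny"

def score_image(image, query, w_obj=1.0, w_bg=0.5, w_cond=0.5):
    """Toy score: background/condition matches plus, for each requested object,
    a penalty for deviating from the requested size and position."""
    s = w_bg * (image["background"] == query.background)
    s += w_cond * (image["condition"] == query.condition)
    for spec in query.objects:
        best = 0.0
        for det in image["objects"]:
            if det["name"] != spec.name:
                continue
            size_err = abs(det["area_pct"] - spec.area_pct)
            pos_err = sum(abs(a - b) for a, b in zip(det["center"], spec.center))
            best = max(best, 1.0 - size_err - 0.5 * pos_err)
        s += w_obj * best
    return s

if __name__ == "__main__":
    q = Query(objects=[ObjectSpec("dog", 0.3, (0.5, 0.6))],
              background="beach", condition="sunny")
    images = [
        {"objects": [{"name": "dog", "area_pct": 0.25, "center": (0.5, 0.55)}],
         "background": "beach", "condition": "sunny"},
        {"objects": [{"name": "cat", "area_pct": 0.2, "center": (0.4, 0.5)}],
         "background": "street", "condition": "night"},
    ]
    ranked = sorted(images, key=lambda im: score_image(im, q), reverse=True)
    print(score_image(ranked[0], q) > score_image(ranked[1], q))  # True
```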
Bilingual analysis of song lyrics and audio words
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396323
Jen-Yu Liu, Chin-Chia Michael Yeh, Yi-Hsuan Yang, Yuan-Ching Teng
{"title":"Bilingual analysis of song lyrics and audio words","authors":"Jen-Yu Liu, Chin-Chia Michael Yeh, Yi-Hsuan Yang, Yuan-Ching Teng","doi":"10.1145/2393347.2396323","DOIUrl":"https://doi.org/10.1145/2393347.2396323","url":null,"abstract":"Thanks to the development of music audio analysis, state-of-the-art techniques can now detect musical attributes such as timbre, rhythm, and pitch with certain level of reliability and effectiveness. An emerging body of research has begun to model the high-level perceptual properties of music listening, including the mood and the preferable listening context of a music piece. Towards this goal, we propose a novel text-like feature representation that encodes the rich and time-varying information of music using a composite of features extracted from the song lyrics and audio signals. In particular, we investigate dictionary learning algorithms to optimize the generation of local feature descriptors and also probabilistic topic models to group semantically relevant text and audio words. This text-like representation leads to significant improvement in automatic mood classification over conventional audio features.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130576868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
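A minimal sketch of a text-like song representation in the spirit of the abstract above: frame-level audio features are quantized into "audio words" with k-means, concatenated with lyric tokens into one document per song, grouped by a topic model, and fed to a mood classifier. The synthetic data, cluster and topic counts, and the scikit-learn components are all assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins: per song, a sequence of frame-level audio features
# (e.g. MFCCs), a lyric string, and a toy binary mood label.
n_songs, n_frames, n_dims = 60, 100, 13
audio = [rng.standard_normal((n_frames, n_dims)) for _ in range(n_songs)]
lyrics = ["love heart night" if i % 2 else "rain cry alone" for i in range(n_songs)]
mood = np.array([i % 2 for i in range(n_songs)])

# 1. Learn an "audio word" codebook by clustering all frames.
codebook = MiniBatchKMeans(n_clusters=32, random_state=0).fit(np.vstack(audio))

# 2. Build one text-like document per song: lyric words + audio words.
docs = []
for feats, text in zip(audio, lyrics):
    audio_words = " ".join(f"aw{w}" for w in codebook.predict(feats))
    docs.append(text + " " + audio_words)
counts = CountVectorizer().fit_transform(docs)

# 3. A topic model groups semantically related text and audio words.
topics = LatentDirichletAllocation(n_components=8, random_state=0).fit_transform(counts)

# 4. Mood classification on the topic representation.
clf = LogisticRegression(max_iter=1000).fit(topics[:40], mood[:40])
print("toy accuracy:", clf.score(topics[40:], mood[40:]))
```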
Unsupervised face-name association via commute distance
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2393383
Jiajun Bu, Bin Xu, Chenxia Wu, Chun Chen, Jianke Zhu, Deng Cai, Xiaofei He
{"title":"Unsupervised face-name association via commute distance","authors":"Jiajun Bu, Bin Xu, Chenxia Wu, Chun Chen, Jianke Zhu, Deng Cai, Xiaofei He","doi":"10.1145/2393347.2393383","DOIUrl":"https://doi.org/10.1145/2393347.2393383","url":null,"abstract":"Recently, the task of unsupervised face-name association has received a considerable interests in multimedia and information retrieval communities. It is quite different with the generic facial image annotation problem because of its unsupervised and ambiguous assignment properties. Specifically, the task of face-name association should obey the following three constraints: (1) a face can only be assigned to a name appearing in its associated caption or to null; (2) a name can be assigned to at most one face; and (3) a face can be assigned to at most one name. Many conventional methods have been proposed to tackle this task while suffering from some common problems, eg, many of them are computational expensive and hard to make the null assignment decision. In this paper, we design a novel framework named face-name association via commute distance (FACD), which judges face-name and face-null assignments under a unified framework via commute distance (CD) algorithm. Then, to further speed up the on-line processing, we propose a novel anchor-based commute distance (ACD) algorithm whose main idea is using the anchor point representation structure to accelerate the eigen-decomposition of the adjacency matrix of a graph. Systematic experiment results on a large scale and real world image-caption database with a total of 194,046 detected faces and 244,725 names show that our proposed approach outperforms many state-of-the-art methods in performance. Our framework is appropriate for a large scale and real-time system.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"39 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130602473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
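Commute distance itself has a standard closed form through the pseudoinverse of the graph Laplacian, which the sketch below computes for a tiny toy graph. The anchor-based acceleration (ACD) and the paper's assignment constraints are not reproduced, and the example edge weights are made up.

```python
import numpy as np

def commute_distance(W):
    """Commute distance on a weighted graph with adjacency matrix W.

    c(i, j) = vol(G) * (L+_ii + L+_jj - 2 L+_ij), where L+ is the
    Moore-Penrose pseudoinverse of the graph Laplacian L = D - W.
    """
    d = W.sum(axis=1)
    L = np.diag(d) - W
    L_pinv = np.linalg.pinv(L)
    vol = d.sum()
    diag = np.diag(L_pinv)
    return vol * (diag[:, None] + diag[None, :] - 2.0 * L_pinv)

if __name__ == "__main__":
    # Tiny graph over face and name nodes; in practice edge weights would come
    # from visual similarity and face-name co-occurrence in captions.
    W = np.array([[0.0, 1.0, 0.2, 0.1],
                  [1.0, 0.0, 0.1, 0.2],
                  [0.2, 0.1, 0.0, 1.0],
                  [0.1, 0.2, 1.0, 0.0]])
    print(np.round(commute_distance(W), 2))
```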
Scaring or pleasing: exploit emotional impact of an image
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396487
Bing Li, Songhe Feng, Weihua Xiong, Weiming Hu
{"title":"Scaring or pleasing: exploit emotional impact of an image","authors":"Bing Li, Songhe Feng, Weihua Xiong, Weiming Hu","doi":"10.1145/2393347.2396487","DOIUrl":"https://doi.org/10.1145/2393347.2396487","url":null,"abstract":"Automatic image emotion analysis has emerged as a hot topic due to its potential application on high-level image understanding. Considering the fact that the emotion evoked by an image is not only from its global appearance but also interplays among local regions, we propose a novel affective image classification system based on bilayer sparse representation (BSR). The BSR model contains two layers: The global sparse representation (GSR) is to define global similarities between a test image and all the training images; and the local sparse representation (LSR) is to define similarities of local regions' appearances and their co-occurrence between a test image and all the training images. The experiments on real data sets demonstrate that our system is effective on image emotion recognition.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132319949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 43
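The bilayer (global plus local) model is not reproduced here; the sketch below only illustrates the underlying sparse-representation classification idea: code a test feature vector over a dictionary of training vectors and pick the class with the smallest reconstruction residual. The Lasso solver, feature dimensions, and toy data are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_predict(D, labels, x, alpha=0.01):
    """Sparse-representation classification: solve
    min_a ||x - D a||^2 + alpha ||a||_1, then pick the class whose
    training columns reconstruct x with the smallest residual."""
    a = Lasso(alpha=alpha, max_iter=10000).fit(D, x).coef_
    best_cls, best_res = None, np.inf
    for cls in np.unique(labels):
        a_cls = np.where(labels == cls, a, 0.0)   # keep only this class's coefficients
        res = np.linalg.norm(x - D @ a_cls)
        if res < best_res:
            best_cls, best_res = cls, res
    return best_cls

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Columns of D are toy feature vectors of training images; labels are their
    # emotion classes (0 = pleasing, 1 = scaring).
    D = rng.standard_normal((50, 40))
    labels = np.array([0] * 20 + [1] * 20)
    x = D[:, 3] + 0.05 * rng.standard_normal(50)   # near a class-0 training image
    print(src_predict(D, labels, x))               # expected: 0
```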
Recognizing actions using depth motion maps-based histograms of oriented gradients
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396382
Xiaodong Yang, Chenyang Zhang, Yingli Tian
{"title":"Recognizing actions using depth motion maps-based histograms of oriented gradients","authors":"Xiaodong Yang, Chenyang Zhang, Yingli Tian","doi":"10.1145/2393347.2396382","DOIUrl":"https://doi.org/10.1145/2393347.2396382","url":null,"abstract":"In this paper, we propose an effective method to recognize human actions from sequences of depth maps, which provide additional body shape and motion information for action recognition. In our approach, we project depth maps onto three orthogonal planes and accumulate global activities through entire video sequences to generate the Depth Motion Maps (DMM). Histograms of Oriented Gradients (HOG) are then computed from DMM as the representation of an action video. The recognition results on Microsoft Research (MSR) Action3D dataset show that our approach significantly outperforms the state-of-the-art methods, although our representation is much more compact. In addition, we investigate how many frames are required in our framework to recognize actions on the MSR Action3D dataset. We observe that a short sub-sequence of 30-35 frames is sufficient to achieve comparable results to that operating on entire video sequences.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131332589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 565
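A simplified sketch of the DMM-HOG idea described above: each depth frame is projected onto front, side, and top view maps, frame-to-frame differences are accumulated into one motion map per view, and each map is described with HOG. The depth binning, the use of accumulated absolute differences, and the skimage HOG parameters are assumptions rather than the paper's exact settings.

```python
import numpy as np
from skimage.feature import hog

def project(depth, n_bins=64, d_max=4000.0):
    """Project one depth frame onto front, side and top view occupancy maps."""
    h, w = depth.shape
    front = (depth > 0).astype(float)
    z = np.clip((depth / d_max * (n_bins - 1)).astype(int), 0, n_bins - 1)
    side = np.zeros((h, n_bins))            # rows: image y, cols: depth bin
    top = np.zeros((n_bins, w))             # rows: depth bin, cols: image x
    ys, xs = np.nonzero(depth > 0)
    side[ys, z[ys, xs]] = 1.0
    top[z[ys, xs], xs] = 1.0
    return front, side, top

def dmm_hog(depth_seq):
    """Accumulate frame-to-frame motion on each projection (a simplified DMM)
    and describe each motion map with HOG; concatenate the three descriptors."""
    maps = [project(f) for f in depth_seq]
    feats = []
    for v in range(3):                      # front, side, top
        dmm = sum(np.abs(maps[t + 1][v] - maps[t][v]) for t in range(len(maps) - 1))
        feats.append(hog(dmm, orientations=9, pixels_per_cell=(16, 16),
                         cells_per_block=(2, 2)))
    return np.concatenate(feats)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.integers(0, 4000, size=(20, 64, 64)).astype(float)  # fake depth video
    print(dmm_hog(seq).shape)
```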
Music/speech classification using high-level features derived from fMRI brain imaging
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396322
Xi Jiang, Tuo Zhang, Xintao Hu, Lie Lu, Junwei Han, Lei Guo, Tianming Liu
{"title":"Music/speech classification using high-level features derived from fmri brain imaging","authors":"Xi Jiang, Tuo Zhang, Xintao Hu, Lie Lu, Junwei Han, Lei Guo, Tianming Liu","doi":"10.1145/2393347.2396322","DOIUrl":"https://doi.org/10.1145/2393347.2396322","url":null,"abstract":"With the availability of large amount of audio tracks through a variety of sources and distribution channels, automatic music/speech classification becomes an indispensable tool in social audio websites and online audio communities. However, the accuracy of current acoustic-based low-level feature classification methods is still rather far from satisfaction. The discrepancy between the limited descriptive power of low-level features and the richness of high-level semantics perceived by the human brain has become the 'bottleneck' problem in audio signal analysis. In this paper, functional magnetic resonance imaging (fMRI) which monitors the human brain's response under the natural stimulus of music/speech listening is used as high-level features in the brain imaging space (BIS). We developed a computational framework to model the relationships between BIS features and low-level features in the training dataset with fMRI scans, predict BIS features of testing dataset without fMRI scans, and use the predicted BIS features for music/speech classification in the application stage. Experimental results demonstrated the significantly improved performance of music/speech classification via predicted BIS features than that via the original low-level features.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129265752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
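A minimal sketch of the two-stage idea on synthetic data: a regressor maps low-level acoustic features to BIS features on clips that have fMRI scans, predicts BIS features for unseen clips, and a classifier separates music from speech in the predicted BIS space. The Ridge and SVC choices and all data below are placeholders, not the authors' models.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins: low-level acoustic features X, fMRI-derived BIS features
# B (only available for training clips), and labels y (0 = speech, 1 = music).
n_train, n_test, d_low, d_bis = 80, 20, 40, 10
X_train = rng.standard_normal((n_train, d_low))
X_test = rng.standard_normal((n_test, d_low))
proj = rng.standard_normal((d_low, d_bis))
B_train = X_train @ proj + 0.1 * rng.standard_normal((n_train, d_bis))
y_train = (B_train[:, 0] > 0).astype(int)
y_test = ((X_test @ proj)[:, 0] > 0).astype(int)

# Stage 1: map low-level features to BIS features on clips with fMRI scans.
bis_model = Ridge(alpha=1.0).fit(X_train, B_train)

# Stage 2: classify music vs. speech in the predicted BIS space.
clf = SVC(kernel="rbf").fit(bis_model.predict(X_train), y_train)
print("toy accuracy:", clf.score(bis_model.predict(X_test), y_test))
```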