Proceedings of the 20th ACM international conference on Multimedia: latest publications

Session details: Full paper session 15: image content analysis
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/3246408
Winston H. Hsu
{"title":"Session details: Full paper session 15: image content analysis","authors":"Winston H. Hsu","doi":"10.1145/3246408","DOIUrl":"https://doi.org/10.1145/3246408","url":null,"abstract":"","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117137262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Modalities consensus for multi-modal constraint propagation
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396309
Zhenyong Fu, Hongtao Lu, H. Ip, Zhiwu Lu
{"title":"Modalities consensus for multi-modal constraint propagation","authors":"Zhenyong Fu, Hongtao Lu, H. Ip, Zhiwu Lu","doi":"10.1145/2393347.2396309","DOIUrl":"https://doi.org/10.1145/2393347.2396309","url":null,"abstract":"This paper presents a novel modalities consensus framework for multi-modal pairwise constraint propagation (MCP). We first combine multiple single-modal constraint propagation (SCP) problems together, and then explicitly introduce a new modalities consensus regularizer to force the propagation results on different modalities to be consistent with each other. With a separable consensus regularizer, the proposed approach can be effectively solved using an alternating optimization way. More importantly, based on our modalities consensus framework, two single-modal constraint propagation algorithms can be directly reformulated as two well-defined multi-modal solutions. Experimental results on constrained clustering tasks have shown that the proposed framework can achieve significant improvements with respect to the state of the arts.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128486765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
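The abstract above does not give the exact objective, so the following is a minimal NumPy sketch of the general idea only: per-modality affinity matrices, an initial pairwise-constraint matrix Z (+1 must-link, -1 cannot-link), a label-propagation-style update per modality, and a simple quadratic pull toward the cross-modality average. The variable names, the update rule, and the consensus term are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def propagate_with_consensus(W_list, Z, alpha=0.9, lam=1.0, iters=50):
    """Toy alternating scheme: each modality propagates constraints over its own
    graph (label-propagation style) while a quadratic term pulls every
    per-modality result toward the cross-modality average."""
    S_list = []
    for W in W_list:
        d = W.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
        # Symmetrically normalized graph operator S = D^{-1/2} W D^{-1/2}.
        S_list.append(W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :])

    F_list = [Z.copy() for _ in W_list]
    for _ in range(iters):
        F_bar = np.mean(F_list, axis=0)                  # current consensus
        for m, S in enumerate(S_list):
            # One propagation step on modality m, plus a pull toward consensus.
            prop = alpha * S @ F_list[m] @ S.T + (1 - alpha) * Z
            F_list[m] = (prop + lam * F_bar) / (1.0 + lam)
    return np.mean(F_list, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20
    W1 = np.abs(rng.standard_normal((n, n)))
    W1 = (W1 + W1.T) / 2
    W2 = np.abs(rng.standard_normal((n, n)))
    W2 = (W2 + W2.T) / 2
    Z = np.zeros((n, n))
    Z[0, 1] = Z[1, 0] = 1.0      # must-link
    Z[0, 5] = Z[5, 0] = -1.0     # cannot-link
    print(propagate_with_consensus([W1, W2], Z).shape)   # (20, 20)
```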
A tool for automatic cinemagraphs
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396431
Mei-Chen Yeh, Po-Yi Li
{"title":"A tool for automatic cinemagraphs","authors":"Mei-Chen Yeh, Po-Yi Li","doi":"10.1145/2393347.2396431","DOIUrl":"https://doi.org/10.1145/2393347.2396431","url":null,"abstract":"A cinemagraph is a new type of medium that infuses a static image with the dynamics of one or a few particular regions. It is in many ways intermediate between a photograph and a video, and provides a simple, yet expressive way to mix static and dynamic elements from a video clip. However, the process of creating cinemagraphs is usually tedious for end users and requires serious photo editing skills. In this demonstration we show a tool that creates cinemagraphs in a fully automatic manner. The technique should enable new features such as \"intelligent cinemagraph mode\" for digital cameras that provides an alternative method to capture the moment.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130028102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
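As a rough illustration of what a cinemagraph is, the sketch below composites video frames with a frozen reference frame using a region mask. The paper's contribution is selecting that region automatically, which the sketch does not attempt; the mask and data here are synthetic placeholders.

```python
import numpy as np

def composite_cinemagraph(frames, still, mask):
    """Compose cinemagraph frames: inside `mask` the video keeps moving,
    everywhere else the static `still` frame is shown.

    frames: (T, H, W, 3) float array of video frames in [0, 1]
    still:  (H, W, 3) reference frame to freeze
    mask:   (H, W) float array in [0, 1]; 1 = animated region
    """
    m = mask[None, ..., None]                  # broadcast over time and channels
    return m * frames + (1.0 - m) * still[None, ...]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.random((8, 64, 64, 3))        # fake video clip
    still = frames[0]                          # frame to freeze
    mask = np.zeros((64, 64))
    mask[16:48, 16:48] = 1.0                   # keep only this region animated
    out = composite_cinemagraph(frames, still, mask)
    print(out.shape)                           # (8, 64, 64, 3)
```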
A multimedia analytics framework for browsing image collections in digital forensics
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2393392
M. Worring, Andreas Engl, Camelia Smeria
{"title":"A multimedia analytics framework for browsing image collections in digital forensics","authors":"M. Worring, Andreas Engl, Camelia Smeria","doi":"10.1145/2393347.2393392","DOIUrl":"https://doi.org/10.1145/2393347.2393392","url":null,"abstract":"Searching through large collections of images to find patterns of use or to find sets of relevant items is difficult, especially when the information to consider is not only the content of the images itself, but also the associated metadata. Multimedia analytics is a new approach to such problems. We consider the case of forensic experts facing image collections of growing size during digital forensic investigations. We answer the forensic challenge by developing specialised novel interactive visualisations which employ content-based image clusters in both the analysis as well as in all visualizations. Their synergy makes the task of manually browsing these collections more effective and efficient. Evaluation of such multimedia analytics is a notoriously hard problem as there are so many factors influencing the result. As a controlled evaluation, we developed a user simulation framework to create image collections with time and directory information as metadata. We apply it in a number of scenarios to illustrate its use. The simulation tool is available to other researchers via our website.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128929039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Search web images using objects, backgrounds and conditions
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396350
Jiemi Zhang, Chenxia Wu, Deng Cai
{"title":"Search web images using objects, backgrounds and conditions","authors":"Jiemi Zhang, Chenxia Wu, Deng Cai","doi":"10.1145/2393347.2396350","DOIUrl":"https://doi.org/10.1145/2393347.2396350","url":null,"abstract":"As the volumes of web images have grown rapidly in the last decade, Content-Based Image Retrieval (CBIR) has attracted substantial interests as an effective tool to manage the images. Most existing CBIR systems focus on the object in the image, while ignoring the conditions (day/night, sunny/rain, etc) and the backgrounds, both of which are very helpful to meet the user's information need. To overcome this shortcoming, in this paper, we present a novel CBIR system depending on a novel query formulation considering three aspects: Object, Background and Condition. Specifically, we design a user-friendly interface to help the user formulate a query. The interface can allow the user to give the percentage, relative position and size of each object in the background. Moreover, a corresponding effective ranking method is proposed to return the desirable search results. Experimental results demonstrate that our proposed system improves the searching performance and the user experience compared with the existing searching systems.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126955194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
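The abstract above does not specify the ranking function, so the sketch below is a toy interpretation: it scores candidate images against a structured object/background/condition query, penalizing deviation from the requested object size and relative position. All field names, weights, and the scoring rule are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class ObjectSpec:
    name: str
    area_pct: float       # desired fraction of the image area
    center: tuple         # desired relative (x, y) position in [0, 1]

@dataclass
class Query:
    objects: list
    background: str
    condition: str        # e.g. "night", "sunny"

def score_image(image, query, w_obj=1.0, w_bg=0.5, w_cond=0.5):
    """Toy score: background/condition matches plus, for each requested object,
    a penalty for deviating from the requested size and position."""
    s = w_bg * (image["background"] == query.background)
    s += w_cond * (image["condition"] == query.condition)
    for spec in query.objects:
        best = 0.0
        for det in image["objects"]:
            if det["name"] != spec.name:
                continue
            size_err = abs(det["area_pct"] - spec.area_pct)
            pos_err = sum(abs(a - b) for a, b in zip(det["center"], spec.center))
            best = max(best, 1.0 - size_err - 0.5 * pos_err)
        s += w_obj * best
    return s

if __name__ == "__main__":
    q = Query(objects=[ObjectSpec("dog", 0.3, (0.5, 0.6))],
              background="beach", condition="sunny")
    images = [
        {"objects": [{"name": "dog", "area_pct": 0.25, "center": (0.5, 0.55)}],
         "background": "beach", "condition": "sunny"},
        {"objects": [{"name": "cat", "area_pct": 0.2, "center": (0.4, 0.5)}],
         "background": "street", "condition": "night"},
    ]
    ranked = sorted(images, key=lambda im: score_image(im, q), reverse=True)
    print(score_image(ranked[0], q) > score_image(ranked[1], q))  # True
```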
Bilingual analysis of song lyrics and audio words
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396323
Jen-Yu Liu, Chin-Chia Michael Yeh, Yi-Hsuan Yang, Yuan-Ching Teng
{"title":"Bilingual analysis of song lyrics and audio words","authors":"Jen-Yu Liu, Chin-Chia Michael Yeh, Yi-Hsuan Yang, Yuan-Ching Teng","doi":"10.1145/2393347.2396323","DOIUrl":"https://doi.org/10.1145/2393347.2396323","url":null,"abstract":"Thanks to the development of music audio analysis, state-of-the-art techniques can now detect musical attributes such as timbre, rhythm, and pitch with certain level of reliability and effectiveness. An emerging body of research has begun to model the high-level perceptual properties of music listening, including the mood and the preferable listening context of a music piece. Towards this goal, we propose a novel text-like feature representation that encodes the rich and time-varying information of music using a composite of features extracted from the song lyrics and audio signals. In particular, we investigate dictionary learning algorithms to optimize the generation of local feature descriptors and also probabilistic topic models to group semantically relevant text and audio words. This text-like representation leads to significant improvement in automatic mood classification over conventional audio features.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130576868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
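A minimal sketch of a text-like song representation in the spirit of the abstract above: frame-level audio features are quantized into "audio words" with k-means, concatenated with lyric tokens into one document per song, grouped by a topic model, and fed to a mood classifier. The synthetic data, cluster and topic counts, and the scikit-learn components are all assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins: per song, a sequence of frame-level audio features
# (e.g. MFCCs), a lyric string, and a toy binary mood label.
n_songs, n_frames, n_dims = 60, 100, 13
audio = [rng.standard_normal((n_frames, n_dims)) for _ in range(n_songs)]
lyrics = ["love heart night" if i % 2 else "rain cry alone" for i in range(n_songs)]
mood = np.array([i % 2 for i in range(n_songs)])

# 1. Learn an "audio word" codebook by clustering all frames.
codebook = MiniBatchKMeans(n_clusters=32, random_state=0).fit(np.vstack(audio))

# 2. Build one text-like document per song: lyric words + audio words.
docs = []
for feats, text in zip(audio, lyrics):
    audio_words = " ".join(f"aw{w}" for w in codebook.predict(feats))
    docs.append(text + " " + audio_words)
counts = CountVectorizer().fit_transform(docs)

# 3. A topic model groups semantically related text and audio words.
topics = LatentDirichletAllocation(n_components=8, random_state=0).fit_transform(counts)

# 4. Mood classification on the topic representation.
clf = LogisticRegression(max_iter=1000).fit(topics[:40], mood[:40])
print("toy accuracy:", clf.score(topics[40:], mood[40:]))
```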
Unsupervised face-name association via commute distance
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2393383
Jiajun Bu, Bin Xu, Chenxia Wu, Chun Chen, Jianke Zhu, Deng Cai, Xiaofei He
{"title":"Unsupervised face-name association via commute distance","authors":"Jiajun Bu, Bin Xu, Chenxia Wu, Chun Chen, Jianke Zhu, Deng Cai, Xiaofei He","doi":"10.1145/2393347.2393383","DOIUrl":"https://doi.org/10.1145/2393347.2393383","url":null,"abstract":"Recently, the task of unsupervised face-name association has received a considerable interests in multimedia and information retrieval communities. It is quite different with the generic facial image annotation problem because of its unsupervised and ambiguous assignment properties. Specifically, the task of face-name association should obey the following three constraints: (1) a face can only be assigned to a name appearing in its associated caption or to null; (2) a name can be assigned to at most one face; and (3) a face can be assigned to at most one name. Many conventional methods have been proposed to tackle this task while suffering from some common problems, eg, many of them are computational expensive and hard to make the null assignment decision. In this paper, we design a novel framework named face-name association via commute distance (FACD), which judges face-name and face-null assignments under a unified framework via commute distance (CD) algorithm. Then, to further speed up the on-line processing, we propose a novel anchor-based commute distance (ACD) algorithm whose main idea is using the anchor point representation structure to accelerate the eigen-decomposition of the adjacency matrix of a graph. Systematic experiment results on a large scale and real world image-caption database with a total of 194,046 detected faces and 244,725 names show that our proposed approach outperforms many state-of-the-art methods in performance. Our framework is appropriate for a large scale and real-time system.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"39 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130602473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
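Commute distance itself has a standard closed form through the pseudoinverse of the graph Laplacian, which the sketch below computes for a tiny toy graph. The anchor-based acceleration (ACD) and the paper's assignment constraints are not reproduced, and the example edge weights are made up.

```python
import numpy as np

def commute_distance(W):
    """Commute distance on a weighted graph with adjacency matrix W.

    c(i, j) = vol(G) * (L+_ii + L+_jj - 2 L+_ij), where L+ is the
    Moore-Penrose pseudoinverse of the graph Laplacian L = D - W.
    """
    d = W.sum(axis=1)
    L = np.diag(d) - W
    L_pinv = np.linalg.pinv(L)
    vol = d.sum()
    diag = np.diag(L_pinv)
    return vol * (diag[:, None] + diag[None, :] - 2.0 * L_pinv)

if __name__ == "__main__":
    # Tiny graph over face and name nodes; in practice edge weights would come
    # from visual similarity and face-name co-occurrence in captions.
    W = np.array([[0.0, 1.0, 0.2, 0.1],
                  [1.0, 0.0, 0.1, 0.2],
                  [0.2, 0.1, 0.0, 1.0],
                  [0.1, 0.2, 1.0, 0.0]])
    print(np.round(commute_distance(W), 2))
```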
Scaring or pleasing: exploit emotional impact of an image
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396487
Bing Li, Songhe Feng, Weihua Xiong, Weiming Hu
{"title":"Scaring or pleasing: exploit emotional impact of an image","authors":"Bing Li, Songhe Feng, Weihua Xiong, Weiming Hu","doi":"10.1145/2393347.2396487","DOIUrl":"https://doi.org/10.1145/2393347.2396487","url":null,"abstract":"Automatic image emotion analysis has emerged as a hot topic due to its potential application on high-level image understanding. Considering the fact that the emotion evoked by an image is not only from its global appearance but also interplays among local regions, we propose a novel affective image classification system based on bilayer sparse representation (BSR). The BSR model contains two layers: The global sparse representation (GSR) is to define global similarities between a test image and all the training images; and the local sparse representation (LSR) is to define similarities of local regions' appearances and their co-occurrence between a test image and all the training images. The experiments on real data sets demonstrate that our system is effective on image emotion recognition.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132319949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 43
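The bilayer (global plus local) model is not reproduced here; the sketch below only illustrates the underlying sparse-representation classification idea: code a test feature vector over a dictionary of training vectors and pick the class with the smallest reconstruction residual. The Lasso solver, feature dimensions, and toy data are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_predict(D, labels, x, alpha=0.01):
    """Sparse-representation classification: solve
    min_a ||x - D a||^2 + alpha ||a||_1, then pick the class whose
    training columns reconstruct x with the smallest residual."""
    a = Lasso(alpha=alpha, max_iter=10000).fit(D, x).coef_
    best_cls, best_res = None, np.inf
    for cls in np.unique(labels):
        a_cls = np.where(labels == cls, a, 0.0)   # keep only this class's coefficients
        res = np.linalg.norm(x - D @ a_cls)
        if res < best_res:
            best_cls, best_res = cls, res
    return best_cls

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Columns of D are toy feature vectors of training images; labels are their
    # emotion classes (0 = pleasing, 1 = scaring).
    D = rng.standard_normal((50, 40))
    labels = np.array([0] * 20 + [1] * 20)
    x = D[:, 3] + 0.05 * rng.standard_normal(50)   # near a class-0 training image
    print(src_predict(D, labels, x))               # expected: 0
```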
Recognizing actions using depth motion maps-based histograms of oriented gradients
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396382
Xiaodong Yang, Chenyang Zhang, Yingli Tian
{"title":"Recognizing actions using depth motion maps-based histograms of oriented gradients","authors":"Xiaodong Yang, Chenyang Zhang, Yingli Tian","doi":"10.1145/2393347.2396382","DOIUrl":"https://doi.org/10.1145/2393347.2396382","url":null,"abstract":"In this paper, we propose an effective method to recognize human actions from sequences of depth maps, which provide additional body shape and motion information for action recognition. In our approach, we project depth maps onto three orthogonal planes and accumulate global activities through entire video sequences to generate the Depth Motion Maps (DMM). Histograms of Oriented Gradients (HOG) are then computed from DMM as the representation of an action video. The recognition results on Microsoft Research (MSR) Action3D dataset show that our approach significantly outperforms the state-of-the-art methods, although our representation is much more compact. In addition, we investigate how many frames are required in our framework to recognize actions on the MSR Action3D dataset. We observe that a short sub-sequence of 30-35 frames is sufficient to achieve comparable results to that operating on entire video sequences.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131332589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 565
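A simplified sketch of the DMM-HOG idea described above: each depth frame is projected onto front, side, and top view maps, frame-to-frame differences are accumulated into one motion map per view, and each map is described with HOG. The depth binning, the use of accumulated absolute differences, and the skimage HOG parameters are assumptions rather than the paper's exact settings.

```python
import numpy as np
from skimage.feature import hog

def project(depth, n_bins=64, d_max=4000.0):
    """Project one depth frame onto front, side and top view occupancy maps."""
    h, w = depth.shape
    front = (depth > 0).astype(float)
    z = np.clip((depth / d_max * (n_bins - 1)).astype(int), 0, n_bins - 1)
    side = np.zeros((h, n_bins))            # rows: image y, cols: depth bin
    top = np.zeros((n_bins, w))             # rows: depth bin, cols: image x
    ys, xs = np.nonzero(depth > 0)
    side[ys, z[ys, xs]] = 1.0
    top[z[ys, xs], xs] = 1.0
    return front, side, top

def dmm_hog(depth_seq):
    """Accumulate frame-to-frame motion on each projection (a simplified DMM)
    and describe each motion map with HOG; concatenate the three descriptors."""
    maps = [project(f) for f in depth_seq]
    feats = []
    for v in range(3):                      # front, side, top
        dmm = sum(np.abs(maps[t + 1][v] - maps[t][v]) for t in range(len(maps) - 1))
        feats.append(hog(dmm, orientations=9, pixels_per_cell=(16, 16),
                         cells_per_block=(2, 2)))
    return np.concatenate(feats)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.integers(0, 4000, size=(20, 64, 64)).astype(float)  # fake depth video
    print(dmm_hog(seq).shape)
```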
Music/speech classification using high-level features derived from fMRI brain imaging
Proceedings of the 20th ACM international conference on Multimedia. Pub Date: 2012-10-29. DOI: 10.1145/2393347.2396322
Xi Jiang, Tuo Zhang, Xintao Hu, Lie Lu, Junwei Han, Lei Guo, Tianming Liu
{"title":"Music/speech classification using high-level features derived from fmri brain imaging","authors":"Xi Jiang, Tuo Zhang, Xintao Hu, Lie Lu, Junwei Han, Lei Guo, Tianming Liu","doi":"10.1145/2393347.2396322","DOIUrl":"https://doi.org/10.1145/2393347.2396322","url":null,"abstract":"With the availability of large amount of audio tracks through a variety of sources and distribution channels, automatic music/speech classification becomes an indispensable tool in social audio websites and online audio communities. However, the accuracy of current acoustic-based low-level feature classification methods is still rather far from satisfaction. The discrepancy between the limited descriptive power of low-level features and the richness of high-level semantics perceived by the human brain has become the 'bottleneck' problem in audio signal analysis. In this paper, functional magnetic resonance imaging (fMRI) which monitors the human brain's response under the natural stimulus of music/speech listening is used as high-level features in the brain imaging space (BIS). We developed a computational framework to model the relationships between BIS features and low-level features in the training dataset with fMRI scans, predict BIS features of testing dataset without fMRI scans, and use the predicted BIS features for music/speech classification in the application stage. Experimental results demonstrated the significantly improved performance of music/speech classification via predicted BIS features than that via the original low-level features.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129265752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
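A minimal sketch of the two-stage idea on synthetic data: a regressor maps low-level acoustic features to BIS features on clips that have fMRI scans, predicts BIS features for unseen clips, and a classifier separates music from speech in the predicted BIS space. The Ridge and SVC choices and all data below are placeholders, not the authors' models.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins: low-level acoustic features X, fMRI-derived BIS features
# B (only available for training clips), and labels y (0 = speech, 1 = music).
n_train, n_test, d_low, d_bis = 80, 20, 40, 10
X_train = rng.standard_normal((n_train, d_low))
X_test = rng.standard_normal((n_test, d_low))
proj = rng.standard_normal((d_low, d_bis))
B_train = X_train @ proj + 0.1 * rng.standard_normal((n_train, d_bis))
y_train = (B_train[:, 0] > 0).astype(int)
y_test = ((X_test @ proj)[:, 0] > 0).astype(int)

# Stage 1: map low-level features to BIS features on clips with fMRI scans.
bis_model = Ridge(alpha=1.0).fit(X_train, B_train)

# Stage 2: classify music vs. speech in the predicted BIS space.
clf = SVC(kernel="rbf").fit(bis_model.predict(X_train), y_train)
print("toy accuracy:", clf.score(bis_model.predict(X_test), y_test))
```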