Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval: Latest Publications

Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206064
Niluthpol Chowdhury Mithun, Juncheng Billy Li, Florian Metze, A. Roy-Chowdhury
Abstract: Constructing a joint representation invariant across different modalities (e.g., video, language) is of significant importance in many multimedia applications. While there have been a number of recent successes in developing effective image-text retrieval methods by learning joint representations, the video-text retrieval task has not been explored to its fullest extent. In this paper, we study how to effectively utilize the multimodal cues available in videos for the cross-modal video-text retrieval task. Based on our analysis, we propose a novel framework that simultaneously utilizes multimodal features (different visual characteristics, audio inputs, and text) through a fusion strategy for efficient retrieval. Furthermore, we explore several loss functions for training the embedding and propose a modified pairwise ranking loss for the task. Experiments on the MSVD and MSR-VTT datasets demonstrate that our method achieves significant performance gains over state-of-the-art approaches.
Citations: 220
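The pairwise ranking objective mentioned in the abstract is commonly implemented as a bidirectional max-margin loss over a batch of matched video-text pairs. The following is a minimal sketch of that standard form; the margin value and the summation over all negatives are illustrative assumptions, and the paper's modified variant is not reproduced here.

```python
# Minimal sketch of a standard bidirectional max-margin ranking loss for joint
# video-text embeddings; hyperparameters here are illustrative assumptions.
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(video_emb, text_emb, margin=0.2):
    """video_emb, text_emb: (batch, dim) embeddings of matched video-text pairs."""
    video_emb = F.normalize(video_emb, dim=1)
    text_emb = F.normalize(text_emb, dim=1)
    scores = video_emb @ text_emb.t()            # cosine similarity matrix
    diagonal = scores.diag().view(-1, 1)         # similarities of the true pairs
    # Hinge cost for retrieving text given video, and video given text.
    cost_t = (margin + scores - diagonal).clamp(min=0)      # rows: video -> all texts
    cost_v = (margin + scores - diagonal.t()).clamp(min=0)  # cols: text -> all videos
    # The diagonal holds true pairs, which should not be penalized.
    mask = torch.eye(scores.size(0), dtype=torch.bool, device=scores.device)
    cost_t = cost_t.masked_fill(mask, 0)
    cost_v = cost_v.masked_fill(mask, 0)
    return cost_t.sum() + cost_v.sum()

v, t = torch.randn(8, 256), torch.randn(8, 256)
print(pairwise_ranking_loss(v, t))
```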
Supervised Nonparametric Multimodal Topic Modeling Methods for Multi-class Video Classification
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206036
Jianfei Xue, K. Eguchi
Abstract: Nonparametric topic models such as hierarchical Dirichlet processes (HDP) have been attracting increasing attention for multimedia data analysis. However, the existing models for multimedia data are unsupervised: they purely cluster semantically or characteristically related features into latent topics without considering side information such as class labels. In this paper, we present a novel supervised sequential symmetric correspondence HDP (Sup-SSC-HDP) model for multi-class video classification, where the empirical topic frequencies learned from multimodal video data are modeled as a predictor of the video class. Qualitative and quantitative assessments demonstrate the effectiveness of Sup-SSC-HDP.
Citations: 1
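To make the classification idea concrete, the sketch below feeds per-video empirical topic-frequency vectors into a multi-class classifier. In Sup-SSC-HDP the topic model and the class predictor are coupled, so this two-step separation, the random placeholder topic vectors, and the logistic-regression choice are illustrative assumptions only.

```python
# Minimal sketch: per-video topic frequencies (placeholders standing in for
# Sup-SSC-HDP posteriors) used as features for multi-class video classification.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_videos, n_topics, n_classes = 200, 30, 5
# Placeholder topic-frequency vectors: each row is a distribution over topics.
theta = rng.dirichlet(np.ones(n_topics), size=n_videos)
labels = rng.integers(0, n_classes, size=n_videos)   # placeholder class labels

clf = LogisticRegression(max_iter=1000)
clf.fit(theta, labels)
print("training accuracy:", clf.score(theta, labels))
```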
Steganographer Detection based on Multiclass Dilated Residual Networks
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206031
Mingjie Zheng, S. Zhong, Songtao Wu, Jianmin Jiang
Abstract: The steganographer detection task is to identify criminal users, who attempt to conceal confidential information by steganography, among a large number of innocent users. The significant challenge of the task is how to collect evidence to identify a guilty user from suspicious images that carry secret messages generated by an unknown steganography algorithm and payload. Unfortunately, existing steganalysis methods were designed for binary classification, which makes it hard for them to handle images with different kinds of payloads, especially when the payloads of the test images are not known in advance. In this paper, we propose a novel steganographer detection method based on multiclass deep neural networks. In the training stage, the networks are trained to classify images with six types of payloads. The networks can preserve and even strengthen the weak stego signals left by secret messages over a much larger receptive field by virtue of residual and dilated residual learning. In the inference stage, the learnt model is used to extract discriminative features that capture the difference between guilty and innocent users. A series of empirical results demonstrates that the proposed method achieves good performance in the spatial and frequency domains even when the embedding payload is low. The proposed method is also more robust across steganographic algorithms and offers a possible solution to the payload mismatch problem.
Citations: 11
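As a rough illustration of the building block named in the title, here is a minimal dilated residual block; the channel count, dilation rate and layer arrangement are assumptions and do not reflect the paper's exact architecture.

```python
# Minimal sketch of a dilated residual block; dilation enlarges the receptive
# field without downsampling, which helps preserve weak stego signals.
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, channels, dilation=2):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # residual connection keeps the input signal

x = torch.randn(1, 16, 64, 64)
print(DilatedResidualBlock(16)(x).shape)   # torch.Size([1, 16, 64, 64])
```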
Multimodal Network Embedding via Attention based Multi-view Variational Autoencoder
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206035
Feiran Huang, Xiaoming Zhang, Chaozhuo Li, Zhoujun Li, Yueying He, Zhonghua Zhao
Abstract: Learning embeddings for social media data has attracted extensive research interest and underpins many applications, such as classification and link prediction. In this paper, we examine the scenario of a multimodal network whose nodes contain multimodal content and are connected by heterogeneous relationships, such as social images containing multimodal content (e.g., visual content and text descriptions) and linked in various ways (e.g., in the same album or with the same tag). Given such a multimodal network, simply learning the embedding from the network structure or from a subset of the content results in a sub-optimal representation. We propose a novel deep embedding method, the Attention-based Multi-view Variational Auto-Encoder (AMVAE), to incorporate both the link information and the multimodal content for more effective and efficient embedding. Specifically, we adopt an LSTM with an attention model to learn the correlation between different data modalities, such as the correlation between visual regions and specific words, and thereby obtain a semantic embedding of the multimodal content. The link information and the semantic embedding are then treated as two correlated views, and a multi-view correlation learning based Variational Auto-Encoder (VAE) is proposed to learn the representation of each node, in which the embeddings of the link information and the multimodal content are integrated and mutually reinforced. Experiments on three real-world datasets demonstrate the superiority of the proposed model in two applications, i.e., multi-label classification and link prediction.
Citations: 28
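A minimal sketch of the kind of attention that relates visual regions to words when fusing modalities is shown below; the scaled dot-product scoring and the feature dimensions are illustrative assumptions rather than the exact AMVAE formulation.

```python
# Minimal sketch of word-to-region attention: each word attends over visual
# regions and receives a context vector summarizing the relevant regions.
import torch
import torch.nn.functional as F

def region_word_attention(regions, words):
    """regions: (num_regions, dim) visual-region features;
    words: (num_words, dim) word embeddings projected to the same dimension."""
    scores = words @ regions.t() / regions.size(1) ** 0.5   # (num_words, num_regions)
    weights = F.softmax(scores, dim=1)                       # attention over regions per word
    return weights @ regions                                 # (num_words, dim) context vectors

regions = torch.randn(36, 256)
words = torch.randn(12, 256)
print(region_word_attention(regions, words).shape)   # torch.Size([12, 256])
```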
Collaborative Subspace Graph Hashing for Cross-modal Retrieval
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206042
Xiang Zhang, Guohua Dong, Yimo Du, Chengkun Wu, Zhigang Luo, Canqun Yang
Abstract: Current hashing methods for cross-modal retrieval generally attempt to learn separate modality-specific transformation matrices to embed multi-modality data into a latent common subspace, and they usually ignore the fact that respecting the diversity of multi-modality features in the latent subspace can benefit retrieval. To this end, we propose a collaborative subspace graph hashing method (CSGH) that performs a two-stage collaborative learning framework for cross-modal retrieval. Specifically, CSGH first embeds multi-modality data into separate latent subspaces through individual modality-specific transformation matrices, and then connects these latent subspaces to a common Hamming space through a shared transformation matrix. Within this framework, CSGH captures the modality-specific neighborhood structure and the cross-modal correlation within multi-modality data through a Laplacian regularization and a graph based correlation constraint, respectively. To solve CSGH, we develop an alternating optimization procedure in which each sub-problem has an elegant analytical solution. Cross-modal retrieval experiments on the Wiki, NUS-WIDE, Flickr25K and Flickr1M datasets show the effectiveness of CSGH compared with state-of-the-art cross-modal hashing methods.
Citations: 13
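The two-stage mapping described in the abstract can be sketched as follows: a modality-specific projection into a latent subspace, then a shared projection into a common Hamming space followed by binarization. The matrices below are random placeholders; learning them under the Laplacian and graph-correlation constraints is the actual contribution of CSGH.

```python
# Minimal sketch of the two-stage projection to binary codes; all projection
# matrices are random placeholders rather than learned parameters.
import numpy as np

rng = np.random.default_rng(0)
d_img, d_txt, d_latent, n_bits = 512, 300, 128, 64

W_img = rng.standard_normal((d_img, d_latent))       # image-specific transformation
W_txt = rng.standard_normal((d_txt, d_latent))       # text-specific transformation
W_shared = rng.standard_normal((d_latent, n_bits))   # shared mapping to Hamming space

def hash_codes(x, W_modality):
    latent = x @ W_modality             # stage 1: modality-specific latent subspace
    return np.sign(latent @ W_shared)   # stage 2: shared projection, then binarize

img_feat = rng.standard_normal((5, d_img))
txt_feat = rng.standard_normal((5, d_txt))
print(hash_codes(img_feat, W_img).shape, hash_codes(txt_feat, W_txt).shape)  # (5, 64) (5, 64)
```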
Multi-Scale Spatiotemporal Conv-LSTM Network for Video Saliency Detection
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206052
Yi Tang, Wenbin Zou, Zhi Jin, Xia Li
Abstract: Recently, deep neural networks have become crucial techniques for image saliency detection. However, two difficulties hinder the development of deep learning in video saliency detection. The first is that a traditional static network cannot conduct robust motion estimation in videos. The second is that data-driven deep learning lacks sufficient manually annotated pixel-wise ground truth for training video saliency networks. In this paper, we propose a multi-scale spatiotemporal convolutional LSTM network (MSST-ConvLSTM) to incorporate spatial and temporal cues for video salient object detection. Furthermore, as manual pixel-wise labeling is very time-consuming, we annotate a large number of coarse labels, which are mixed with fine labels to train a robust saliency prediction model. Experiments on widely used, challenging benchmark datasets (e.g., FBMS and DAVIS) demonstrate that the proposed approach achieves competitive video saliency detection performance compared with state-of-the-art saliency models.
Citations: 10
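For readers unfamiliar with ConvLSTMs, here is a minimal single-cell sketch of the kind of recurrent unit such spatiotemporal networks stack; the kernel size and hidden channels are illustrative assumptions, and the paper's multi-scale arrangement is not reproduced.

```python
# Minimal sketch of a ConvLSTM cell: an LSTM whose gates are computed with
# convolutions, so the hidden state keeps its spatial layout across frames.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        # One convolution produces the input, forget, output and candidate gates at once.
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        gates = self.conv(torch.cat([x, h], dim=1))
        i, f, o, g = gates.chunk(4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g
        h = o * c.tanh()
        return h, c

cell = ConvLSTMCell(3, 16)
x = torch.randn(1, 3, 64, 64)
h = c = torch.zeros(1, 16, 64, 64)
for _ in range(4):                 # unroll over 4 video frames
    h, c = cell(x, (h, c))
print(h.shape)                     # torch.Size([1, 16, 64, 64])
```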
Prototyping for Envisioning the Future
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3210490
Yamanaka Shunji
Abstract: As an industrial designer, I have worked in collaboration with various researchers and scientists since the beginning of this century. I have made many prototypes demonstrating the possibilities of their leading-edge technologies and exhibited them over the years. As archives of academic documents and papers have become open, and the internet has given the public access to recordings of experiments conducted throughout the world, laboratory technology is now constantly exposed to the public. In this context, prototypes are becoming more important as the medium that bridges advanced technology and society. A prototype is no longer merely an experimental machine; it is a device created to present the user experience in advance and to share the benefits of a technology with many others. Its role is not limited to sharing values within the development team: it is a medium that voices the significance of research and development to society, an inspiration that stimulates future markets, and a tool for securing development budgets. A prototype is the physical embodiment of a speculative story that connects people to technology that has yet to reach society. I would like to introduce some of the prototypes we developed and share the future vision they invoke.
Citations: 0
Challenges and Opportunities within Personal Life Archives
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206040
Duc-Tien Dang-Nguyen, M. Riegler, Liting Zhou, C. Gurrin
Abstract: Nowadays, almost everyone holds some form of personal life archive. Automatically maintaining such an archive is becoming increasingly common; however, without automatic support, users will quickly be overwhelmed by the volume of data and will miss out on the potential benefits that lifelogs provide. In this paper, we give an overview of the current status of lifelog research and propose a concept for exploring these archives. We motivate the need for new methodologies for indexing data, organizing content and supporting information access. Finally, we describe the challenges to be addressed and give an overview of the initial steps to be taken towards organising and searching personal life archives.
Citations: 9
Towards Better Understanding of Player's Game Experience
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206072
Wenlu Yang, M. Rifqi, C. Marsala, Andrea Pinna
Abstract: Improving the player's game experience has always been a common goal of video game practitioners. In order to better understand players' perception of game experience, we carry out an experimental study for data collection and present a game experience prediction model based on machine learning. The model is trained on the proposed multi-modal database, which contains a physiological modality, a behavioral modality and meta-information, to predict the player's game experience in terms of difficulty, immersion and amusement. By investigating models trained on separate and fused feature sets, we show that the physiological modality is effective. Moreover, further analysis of the most relevant features in the behavioral and meta-information feature sets provides a better understanding of their contributions. We argue that combining the physiological modality with behavioral and meta-information yields better performance on game experience prediction.
Citations: 7
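A minimal sketch of the early-fusion setup implied by the abstract follows: concatenate physiological, behavioral and meta-information feature vectors and train a classifier on the fused representation. The feature dimensions, the random placeholder data and the random-forest choice are assumptions, not the authors' setup.

```python
# Minimal sketch of early fusion of three modalities followed by a classifier
# predicting, e.g., perceived difficulty; all data here are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_sessions = 120
physio = rng.standard_normal((n_sessions, 20))    # e.g. heart-rate / EDA statistics
behavior = rng.standard_normal((n_sessions, 15))  # e.g. input / gameplay statistics
meta = rng.standard_normal((n_sessions, 5))       # e.g. player background information
difficulty = rng.integers(0, 3, size=n_sessions)  # low / medium / high labels

fused = np.hstack([physio, behavior, meta])       # early fusion of the three modalities
scores = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                         fused, difficulty, cv=5)
print("cross-validated accuracy:", scores.mean())
```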
MOOCex
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval Pub Date: 2018-06-05 DOI: 10.1145/3206025.3206087
Matthew Cooper, Jian Zhao, C. Bhatt, David A. Shamma
Abstract: Massive Open Online Course (MOOC) platforms have scaled online education to unprecedented enrollments, but remain limited by their predetermined curricula. Increasingly, professionals consume this content to augment or update specific skills rather than complete degree or certification programs. To better address the needs of this emergent user population, we describe a visual recommender system called MOOCex. The system recommends lecture videos across multiple courses and content platforms to provide a choice of perspectives on topics of interest. The recommendation engine considers both video content and sequential inter-topic relationships mined from course syllabi. Furthermore, it allows for interactive visual exploration of the semantic space of recommendations within a learner's current context.
Citations: 9
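A hedged sketch of blending the two signals the abstract mentions, content similarity between lecture videos and a sequential inter-topic relationship mined from syllabi, is given below; the linear weighting and the toy data are assumptions and do not represent MOOCex's actual scoring.

```python
# Minimal sketch: score candidate lectures by a weighted mix of content
# similarity and a "typically follows in syllabi" relation; data are placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_videos, dim = 50, 64
video_vecs = rng.standard_normal((n_videos, dim))           # content embeddings of lectures
video_vecs /= np.linalg.norm(video_vecs, axis=1, keepdims=True)
# follows[i, j] = 1 if the topic of video j typically follows that of video i in syllabi.
follows = (rng.random((n_videos, n_videos)) > 0.9).astype(float)

def recommend(current, k=5, alpha=0.7):
    content_sim = video_vecs @ video_vecs[current]           # cosine similarity to current video
    score = alpha * content_sim + (1 - alpha) * follows[current]
    score[current] = -np.inf                                  # never recommend the same video
    return np.argsort(score)[::-1][:k]

print(recommend(current=3))
```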