Multi-modal learning for social image classification

2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). Pub Date: 2016-08-01. DOI: 10.1109/FSKD.2016.7603345

Chunyang Liu, Xu Zhang, Xiong Li, Rui Li, Xiaoming Zhang, Wen-Han Chao

Citations: 3

Abstract


Social image classification is attracting growing interest because of its importance in web-based image applications. Although many image classification approaches exist, integrating the multi-modal content of social images remains a major challenge, since textual content and visual content are represented in two heterogeneous feature spaces. In this study, we propose a multi-modal learning algorithm that fuses the two kinds of features through their correlation. Specifically, we learn two linear classification modules, one for each type of feature, and integrate them through l2 normalization in a joint model. With the joint model, the classification based on visual features can be reinforced by the classification based on textual features, and vice versa. A test image is then classified using both its textual and visual features by combining the results of the two classifiers. To evaluate the approach, we conduct experiments on real-world datasets, and the results show that the proposed algorithm outperforms the baselines.
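
The abstract does not state the objective function, so the following is only an illustrative sketch of the general idea (two linear classifiers, one per modality, coupled in a joint model and fused at test time), not the authors' algorithm. The squared-loss objective, the coupling weight `mu`, the ridge weight `lam`, the alternating solver, and the function names `fit_joint_linear` and `predict_fused` are all assumptions made for this example.

```python
import numpy as np


def fit_joint_linear(Xv, Xt, Y, lam=1.0, mu=1.0, n_iter=20):
    """Fit two coupled linear classifiers by alternating closed-form updates.

    Assumed objective (NOT taken from the paper):
        ||Xv Wv - Y||^2 + ||Xt Wt - Y||^2
        + mu  * ||Xv Wv - Xt Wt||^2        (l2 coupling between modalities)
        + lam * (||Wv||^2 + ||Wt||^2)      (ridge regularization)

    Xv: (n, dv) visual features, Xt: (n, dt) textual features,
    Y:  (n, c) one-hot labels.
    """
    dv, dt = Xv.shape[1], Xt.shape[1]
    Wv = np.zeros((dv, Y.shape[1]))
    Wt = np.zeros((dt, Y.shape[1]))
    # Left-hand sides of the normal equations are constant across iterations.
    Av = (1.0 + mu) * Xv.T @ Xv + lam * np.eye(dv)
    At = (1.0 + mu) * Xt.T @ Xt + lam * np.eye(dt)
    for _ in range(n_iter):
        # Each update minimizes the objective over one block with the other fixed,
        # so the textual scores pull the visual classifier and vice versa.
        Wv = np.linalg.solve(Av, Xv.T @ (Y + mu * (Xt @ Wt)))
        Wt = np.linalg.solve(At, Xt.T @ (Y + mu * (Xv @ Wv)))
    return Wv, Wt


def predict_fused(Xv, Xt, Wv, Wt):
    """Combine the two classifiers by summing their class scores."""
    scores = Xv @ Wv + Xt @ Wt
    return scores.argmax(axis=1)
```

Summing the scores is one simple way to combine the two classifiers; a weighted sum or a product of normalized scores would be equally plausible readings of the abstract.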
