Fast Multi-Modal Unified Sparse Representation Learning
Mridula Verma, K. K. Shukla
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017-06-06
DOI: 10.1145/3078971.3079040 (https://doi.org/10.1145/3078971.3079040)
Exploiting feature sets from different modalities can significantly improve recognition accuracy. Learning a unified representation of an object from its representations in different modalities (e.g., image, text, audio) is a well-studied problem in the multimedia retrieval literature. In this paper, we introduce a new iterative algorithm that learns a sparse unified representation with better accuracy and in fewer iterations than previously reported results. Our algorithm employs a new fixed-point iterative scheme together with an inertial step. To obtain a more discriminative representation, we also impose a regularization term that uses the label information in the datasets. Experimental results on two real benchmark datasets demonstrate the efficacy of our method in terms of both the number of iterations and accuracy.
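The abstract does not spell out the fixed-point scheme or the label-based regularizer, so the following is only a generic illustration of what "a fixed-point iteration with an inertial step" can look like for sparse coding. It is a minimal sketch of a standard inertial forward-backward (ISTA-style) iteration for an l1-regularized least-squares problem, not the authors' algorithm. The function names, the parameters `lam` and `beta`, and the idea of stacking per-modality dictionaries so that all modalities share one sparse code are assumptions made for this example.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def objective(D, x, a, lam):
    """0.5 * ||x - D a||^2 + lam * ||a||_1."""
    r = x - D @ a
    return 0.5 * float(r @ r) + lam * float(np.abs(a).sum())


def stack_modalities(dicts, feats):
    """Hypothetical helper: stack per-modality dictionaries (d_m x k) and
    feature vectors (d_m,) so that one code `a` explains every modality."""
    return np.vstack(dicts), np.concatenate(feats)


def inertial_ista(D, x, lam=0.1, beta=0.3, n_iter=200):
    """Generic inertial forward-backward iteration (assumed, not from the paper).

    Each pass extrapolates along the previous direction (inertial step),
    takes a gradient step on the least-squares term, then applies the
    l1 proximal operator.
    """
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    a_prev = np.zeros(D.shape[1])
    a = a_prev.copy()
    for _ in range(n_iter):
        y = a + beta * (a - a_prev)            # inertial extrapolation
        grad = D.T @ (D @ y - x)               # gradient of 0.5 * ||x - D y||^2
        a_prev, a = a, soft_threshold(y - step * grad, lam * step)
    return a


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D_img = rng.standard_normal((20, 50))      # toy "image" dictionary
    D_txt = rng.standard_normal((10, 50))      # toy "text" dictionary
    a_true = np.zeros(50)
    a_true[[3, 17, 41]] = 1.0
    D, x = stack_modalities([D_img, D_txt], [D_img @ a_true, D_txt @ a_true])
    a_hat = inertial_ista(D, x)
    print("nonzeros in learned code:", int(np.count_nonzero(a_hat)))
```

Setting `beta=0` reduces the loop to plain ISTA, which makes it easy to compare iteration counts with and without the inertial term on toy data. A label-aware regularizer such as the one the abstract mentions would add a further term to the objective and is not modeled here.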