G. Cao, Muhammad-Adeel Waris, Alexandros Iosifidis, M. Gabbouj
2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA), published 2016-12-01. DOI: 10.1109/IPTA.2016.7821032 (https://doi.org/10.1109/IPTA.2016.7821032)
Multi-modal subspace learning with dropout regularization for cross-modal recognition and retrieval
There has been a surge of effort in cross-modal recognition and retrieval in recent multimedia research. Towards this goal, we investigate a multi-modal subspace learning algorithm combined with a Dropout regularizer. Inspired by Dropout regularization for neural networks, we propose to artificially remove the effect of a certain number of feature bins in a probabilistic manner, in order to prevent the linear subspace learning from over-fitting. The novel regularizer is integrated into the multi-modal learning algorithm, which maximizes the between-class scatter while minimizing the within-class scatter in the projected latent space. The new objective function can be solved efficiently as a generalized eigenvalue problem. Experimental results show that superior performance can be obtained in both face-sketch recognition and cross-modal retrieval applications.
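The abstract does not give the objective function, so the following is only a minimal sketch of how a dropout-style regularizer can be folded into a discriminant subspace problem and solved as a generalized eigenvalue problem. It rests on several assumptions that are not taken from the paper: the two modalities are zero-padded into one stacked feature space so that a single projection splits into per-modality blocks `W_a` and `W_b`; dropout is marginalized analytically rather than sampled, using the standard identity that zeroing each feature independently with probability p turns a scatter matrix S into (1-p)^2 S + p(1-p) diag(S) in expectation; and this expectation is applied to the within-class scatter only. The function names, the `drop_prob`, `dim`, and `eps` parameters, and the synthetic data are all hypothetical.

```python
import numpy as np


def expected_dropout_scatter(S, drop_prob):
    """Expected scatter when each feature is zeroed independently with
    probability drop_prob (kept features are not rescaled)."""
    q = 1.0 - drop_prob
    # Off-diagonal terms shrink by q^2; diagonal terms shrink by q only,
    # which is q^2 plus an extra q(1-q) on the diagonal.
    return q * q * S + q * (1.0 - q) * np.diag(np.diag(S))


def fit_dropout_subspace(X_a, X_b, y_a, y_b, drop_prob=0.2, dim=2, eps=1e-6):
    """Illustrative two-modality discriminant subspace (an assumption,
    not the paper's published formulation)."""
    d_a, d_b = X_a.shape[1], X_b.shape[1]
    d = d_a + d_b
    # Zero-pad so both modalities live in one stacked feature space.
    Z = np.vstack([
        np.hstack([X_a, np.zeros((X_a.shape[0], d_b))]),
        np.hstack([np.zeros((X_b.shape[0], d_a)), X_b]),
    ])
    y = np.concatenate([y_a, y_b])
    mean = Z.mean(axis=0)

    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Z_c = Z[y == c]
        m_c = Z_c.mean(axis=0)
        D = Z_c - m_c
        S_w += D.T @ D
        diff = (m_c - mean)[:, None]
        S_b += Z_c.shape[0] * (diff @ diff.T)

    # Marginalized dropout on the within-class scatter; eps keeps it
    # strictly positive definite for the Cholesky factorization.
    S_w_reg = expected_dropout_scatter(S_w, drop_prob) + eps * np.eye(d)

    # Solve S_b w = lambda * S_w_reg w by whitening with S_w_reg = L L^T.
    L = np.linalg.cholesky(S_w_reg)
    L_inv = np.linalg.inv(L)
    evals, evecs = np.linalg.eigh(L_inv @ S_b @ L_inv.T)
    order = np.argsort(evals)[::-1][:dim]
    W = L_inv.T @ evecs[:, order]
    return W[:d_a], W[d_a:], evals[order]


# Synthetic usage: three classes observed in two modalities.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)
X_a = rng.normal(size=(60, 5)) + labels[:, None]
X_b = rng.normal(size=(60, 4)) - labels[:, None]
W_a, W_b, evals = fit_dropout_subspace(X_a, X_b, labels, labels)
latent_a = X_a @ W_a  # modality A projected into the shared space
latent_b = X_b @ W_b  # modality B projected into the same space
```

In this sketch the diagonal term introduced by the dropout expectation acts as a data-dependent ridge penalty on the within-class scatter, which is one way to read the over-fitting argument in the abstract. Cross-modal matching would then compare rows of `latent_a` against rows of `latent_b`, for example by nearest neighbour.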