{"title":"An Efficient Empirical Solver for Localized Multiple Kernel Learning via DNNs","authors":"Ziming Zhang","doi":"10.1109/ICPR48806.2021.9411974","DOIUrl":null,"url":null,"abstract":"In this paper we propose solving localized multiple kernel learning (LMKL) using LMKL-Net, a feedforward deep neural network (DNN). In contrast to previous works, as a learning principle we propose parameterizing the gating function for learning kernel combination weights and the multiclass classifier using an attentional network (AN) and a multilayer perceptron (MLP), respectively. Such interpretability helps us better understand how the network solves the problem. Thanks to stochastic gradient descent (SGD), our approach has linear computational complexity in training. Empirically on benchmark datasets we demonstrate that with comparable or better accuracy than the state-of-the-art, our LMKL-Net can be trained about two orders of magnitude faster with about two orders of magnitude smaller memory footprint for large-scale learning.","PeriodicalId":6783,"journal":{"name":"2020 25th International Conference on Pattern Recognition (ICPR)","volume":"71 3 1","pages":"647-654"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 25th International Conference on Pattern Recognition (ICPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPR48806.2021.9411974","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
In this paper, we propose solving localized multiple kernel learning (LMKL) with LMKL-Net, a feedforward deep neural network (DNN). In contrast to previous works, our learning principle is to parameterize the gating function that produces the kernel combination weights with an attentional network (AN), and the multiclass classifier with a multilayer perceptron (MLP). This parameterization makes the model interpretable and helps us better understand how the network solves the problem. Because it is trained with stochastic gradient descent (SGD), our approach has linear computational complexity in training. Empirically, on benchmark datasets we demonstrate that, with accuracy comparable to or better than the state-of-the-art, LMKL-Net can be trained about two orders of magnitude faster and with about two orders of magnitude less memory for large-scale learning.
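The abstract describes LMKL-Net as an attentional gating network that weights a set of base kernels per sample, followed by an MLP classifier, with the whole model trained by SGD. Below is a minimal PyTorch sketch of that idea for illustration only; the module name `LMKLNet`, the assumed input layout (each sample represented by its rows of the base kernel matrices against a fixed anchor set), and all layer sizes are assumptions, not the authors' implementation.

```python
# Minimal sketch of an LMKL-style network: an attentional gating network over
# base kernels plus an MLP classifier. All names and sizes are illustrative.
import torch
import torch.nn as nn

class LMKLNet(nn.Module):
    def __init__(self, num_kernels, kernel_dim, num_classes,
                 gating_hidden=64, mlp_hidden=128):
        super().__init__()
        # Attentional network (AN): produces sample-specific kernel combination weights.
        self.gate = nn.Sequential(
            nn.Linear(num_kernels * kernel_dim, gating_hidden),
            nn.ReLU(),
            nn.Linear(gating_hidden, num_kernels),
            nn.Softmax(dim=-1),
        )
        # Multilayer perceptron (MLP): multiclass classifier on the combined representation.
        self.classifier = nn.Sequential(
            nn.Linear(kernel_dim, mlp_hidden),
            nn.ReLU(),
            nn.Linear(mlp_hidden, num_classes),
        )

    def forward(self, kernel_feats):
        # kernel_feats: (batch, num_kernels, kernel_dim), e.g. each sample's rows
        # of the base kernel matrices evaluated against a fixed set of anchor points.
        b, m, d = kernel_feats.shape
        weights = self.gate(kernel_feats.reshape(b, m * d))       # (batch, num_kernels)
        combined = (weights.unsqueeze(-1) * kernel_feats).sum(1)  # (batch, kernel_dim)
        return self.classifier(combined)

# Mini-batch SGD training gives the linear per-epoch cost mentioned in the abstract.
model = LMKLNet(num_kernels=4, kernel_dim=256, num_classes=10)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(32, 4, 256)            # toy batch of kernel features
y = torch.randint(0, 10, (32,))        # toy labels
opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
opt.step()
```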