Alaor Cervati Neto; Alexandre L. M. Levada; Michel Ferreira Cardia Haddad
Title: Supervised t-SNE for Metric Learning With Stochastic and Geodesic Distances
Journal: IEEE Canadian Journal of Electrical and Computer Engineering, vol. 47, no. 4, pp. 199-205
Published: 2024-10-24
DOI: 10.1109/ICJECE.2024.3429273
URL: https://ieeexplore.ieee.org/document/10734850/
Citations: 0
Abstract
The t-distributed stochastic neighbor embedding (t-SNE) is a powerful algorithm for visualizing high-dimensional data in a lower-dimensional space. It is widely used in machine learning (ML) and data analysis, including unsupervised metric learning. In this article, we propose improvements to two main aspects of t-SNE. First, class labels are incorporated to make it better suited to supervised classification. Second, stochastic and geodesic distances are used as dissimilarity measures to avoid dependence on the standard Euclidean distance, which is particularly sensitive to outliers. Computational experiments with several real-world datasets indicate that the proposed approach can improve classification accuracy compared with established methods. The results indicate superior performance compared with regular t-SNE and linear discriminant analysis (LDA), and a dependence on fewer parameters than the state-of-the-art supervised uniform manifold approximation and projection (UMAP) algorithm.
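To make the two ideas in the abstract concrete, the sketch below shows one common way to approximate geodesic distances (shortest paths over a k-nearest-neighbor graph) and one simple way to fold class labels into a dissimilarity matrix. It is an illustration only and not the authors' method: the abstract does not say how labels or distances are combined, so the function names `geodesic_distances` and `label_aware`, the scaling factors `same` and `diff`, and the toy data are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def geodesic_distances(points, k):
    """Approximate geodesic distances as shortest paths over a symmetric k-NN graph."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Connect i to its k nearest neighbours (excluding itself).
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(points[i], points[j]),
        )[:k]
        for j in nearest:
            d = euclidean(points[i], points[j])
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall: all-pairs shortest paths along graph edges.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


def label_aware(dist, labels, same=0.5, diff=2.0):
    """Hypothetical supervision step (an assumption, not taken from the paper):
    shrink same-class dissimilarities and stretch different-class ones."""
    n = len(dist)
    return [
        [dist[i][j] * (same if labels[i] == labels[j] else diff) for j in range(n)]
        for i in range(n)
    ]


# Toy data: seven points on a unit half-circle, 30 degrees apart.
points = [
    (math.cos(math.radians(30 * i)), math.sin(math.radians(30 * i)))
    for i in range(7)
]
labels = [0, 0, 0, 0, 1, 1, 1]

geo = geodesic_distances(points, k=2)
sup = label_aware(geo, labels)

# The straight-line distance between the two endpoints cuts across the arc,
# while the graph-based distance follows the curve and is therefore longer.
print("Euclidean:", euclidean(points[0], points[6]))
print("Geodesic: ", geo[0][6])
print("Supervised:", sup[0][6])
```

A matrix such as `sup` could then be handed to a t-SNE implementation that accepts precomputed dissimilarities, for example scikit-learn's `TSNE` with `metric="precomputed"` and a non-PCA initialization. Whether that matches the pipeline used in the paper is not stated in the abstract.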