Semisupervised Generalized Discriminant Analysis

IEEE Transactions on Neural Networks, vol. 22, no. 8, pp. 1207-1217. Pub Date: 2011-08-01. Epub Date: 2011-06-30. DOI: 10.1109/TNN.2011.2156808

Yu Zhang, Dit-Yan Yeung


Citations: 33

Abstract



Generalized discriminant analysis (GDA) is a commonly used method for dimensionality reduction. In its general form, it seeks a nonlinear projection that simultaneously maximizes the between-class dissimilarity and minimizes the within-class dissimilarity to increase class separability. In real-world applications where labeled data are scarce, GDA may not work very well. However, unlabeled data are often available in large quantities at very low cost. In this paper, we propose a novel GDA algorithm which is abbreviated as semisupervised generalized discriminant analysis (SSGDA). We utilize unlabeled data to maximize an optimality criterion of GDA and formulate the problem as an optimization problem that is solved using the constrained concave-convex procedure. The optimization procedure leads to estimation of the class labels for the unlabeled data. We propose a novel confidence measure and a method for selecting those unlabeled data points whose labels are estimated with high confidence. The selected unlabeled data can then be used to augment the original labeled dataset for performing GDA. We also propose a variant of SSGDA, called M-SSGDA, which adopts the manifold assumption to utilize the unlabeled data. Extensive experiments on many benchmark datasets demonstrate the effectiveness of our proposed methods.
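To make the between-class versus within-class trade-off concrete, here is a minimal sketch of the classical linear discriminant criterion in NumPy. It is an illustration only and not the authors' method: it covers the linear, fully supervised case, whereas the abstract describes a nonlinear projection, a constrained concave-convex procedure, and a confidence measure whose details are not given here. The function names `scatter_matrices` and `lda_projection` and the regularization parameter `reg` are assumptions introduced for this example.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return (S_b, S_w), the between-class and within-class scatter matrices.

    X is an (n_samples, n_features) array and y holds one class label per row.
    """
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        # Between-class term: class size times outer product of the mean offset.
        diff = (mean_c - mean_all)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
        # Within-class term: scatter of the samples around their own class mean.
        centered = Xc - mean_c
        S_w += centered.T @ centered
    return S_b, S_w


def lda_projection(X, y, n_components=1, reg=1e-6):
    """Return the top directions that maximize between-class scatter
    relative to within-class scatter (linear case only).

    `reg` is an assumed ridge term that keeps S_w invertible when
    labeled data are scarce.
    """
    S_b, S_w = scatter_matrices(X, y)
    S_w_reg = S_w + reg * np.eye(S_w.shape[0])
    # Eigenvectors of S_w^{-1} S_b, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w_reg, S_b))
    order = np.argsort(-eigvals.real)
    return eigvecs[:, order[:n_components]].real
```

With few labeled points per class, `S_w` is poorly estimated, which is one way to see why a purely supervised criterion can degrade when labels are scarce and why additional (unlabeled) data may help.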


Source journal

IEEE Transactions on Neural Networks (Engineering and Technology: Engineering, Electrical and Electronic)

Self-citation rate: 0.00%

Review time: 8.7 months