Semi-Supervised Learning with Conditional Harmonic Mixing

Semi-Supervised Learning. Pub Date: 1900-01-01. DOI: 10.7551/mitpress/9780262033589.003.0014

C. Burges, John C. Platt


Citations: 25

Abstract


Recently, graph-based algorithms, in which nodes represent data points and links encode similarities, have become popular for semi-supervised learning. In this chapter we introduce a general probabilistic formulation called "Conditional Harmonic Mixing" (CHM), in which the links are directed, a conditional probability matrix is associated with each link, and the number of classes can vary from node to node. The posterior class probability at each node is updated by minimizing the KL divergence between its distribution and that predicted by its neighbours. We show that for arbitrary graphs, as long as each unlabeled point is reachable from at least one training point, a solution always exists, is unique, and can be found by solving a sparse linear system iteratively. This result holds even if the graph contains loops, or if the conditional probability matrices are not consistent. We show how, given a classifier for a task, CHM can learn its transition probabilities. Using the Reuters database, we show that, for small training set sizes, CHM improves the accuracy of the best available classifier.
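The update described in the abstract can be illustrated with a small sketch. It rests on assumptions not spelled out in the abstract: each neighbour j predicts a distribution at node i by multiplying the link's conditional probability matrix with its own posterior, and the KL objective is taken in the direction whose minimizer is the plain average of those predictions. The function name `chm_iterate`, the `edges` and `labeled` structures, and the Jacobi-style sweep are hypothetical choices made for illustration, not the chapter's implementation.

```python
import numpy as np

def chm_iterate(n_classes, edges, labeled, n_iter=200, tol=1e-10):
    """Illustrative averaging update on a directed graph (assumed form).

    n_classes: list with the number of classes at each node.
    edges: list of (src, dst, M); M has shape (n_classes[dst], n_classes[src])
           and its columns sum to 1 (a conditional probability matrix).
    labeled: dict mapping a node index to its fixed class distribution.
    """
    n = len(n_classes)
    # Unlabeled nodes start from a uniform distribution.
    post = [np.full(k, 1.0 / k) for k in n_classes]
    for node, p in labeled.items():
        post[node] = np.asarray(p, dtype=float)

    # Group incoming links by destination node.
    incoming = {i: [] for i in range(n)}
    for src, dst, M in edges:
        incoming[dst].append((src, np.asarray(M, dtype=float)))

    for _ in range(n_iter):
        delta = 0.0
        new = list(post)
        for i in range(n):
            # Labeled nodes stay fixed; nodes with no incoming links are skipped.
            if i in labeled or not incoming[i]:
                continue
            # Each neighbour predicts a distribution at node i.
            preds = [M @ post[src] for src, M in incoming[i]]
            # Assumption: the KL-minimizing posterior is the mean prediction.
            new[i] = np.mean(preds, axis=0)
            delta = max(delta, np.abs(new[i] - post[i]).max())
        post = new
        if delta < tol:
            break
    return post

# Small graph with a loop between the unlabeled nodes 1 and 2.
I2 = np.eye(2)
demo_edges = [(0, 1, I2), (2, 1, I2), (1, 2, I2), (3, 2, I2)]
demo_labeled = {0: [1.0, 0.0], 3: [0.0, 1.0]}
posteriors = chm_iterate([2, 2, 2, 2], demo_edges, demo_labeled)
print(posteriors[1], posteriors[2])
```

In this toy graph the two unlabeled nodes form a loop, yet the sweep still settles on a single fixed point, which matches the abstract's claim that loops do not prevent a unique solution. Because the matrices may be rectangular, the same function also handles nodes with differing numbers of classes.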
