A Co-training Approach based TEF-WA technique

Tang Huanling, Lu Mingyu, Liu Na
{"title":"A Co-training Approach based TEF-WA technique","authors":"Tang Huanling, Lu Mingyu, Liu Na","doi":"10.1109/NPC.2007.104","DOIUrl":null,"url":null,"abstract":"Traditional categorization algorithm suffers from not having sufficient labeled training data for learning, while large amount unlabeled data are easily available. We investigate co-training algorithm and its assumption that the features set can be split into two compatible and independent views. However, the assumption is usually violated to some degree in practice and sometimes the natural feature split does not exist. So we adopt TEF_WA technique which utilizes term evaluation functions to split features set and construct multiple views. We can choose a pair of views which are compatible and independent to certain degree. Based TEF_WA technique we develop a semi-supervised categorization algorithm Co_CLM. Experimental results show Co_CLM can significantly decrease the classification error utilizing unlabeled data especially labeled data is sparse. Our experimental results also indicate Co_CLM will achieve more satisfactory performance with the more independent view pairs.","PeriodicalId":278518,"journal":{"name":"2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NPC.2007.104","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Traditional categorization algorithms suffer from insufficient labeled training data for learning, while large amounts of unlabeled data are easily available. We investigate the co-training algorithm and its assumption that the feature set can be split into two compatible and independent views. In practice, however, this assumption is usually violated to some degree, and sometimes no natural feature split exists. We therefore adopt the TEF_WA technique, which uses term evaluation functions to split the feature set and construct multiple views, from which a pair of views that are compatible and independent to a certain degree can be chosen. Based on the TEF_WA technique, we develop a semi-supervised categorization algorithm, Co_CLM. Experimental results show that Co_CLM can significantly decrease classification error by utilizing unlabeled data, especially when labeled data are sparse. Our results also indicate that Co_CLM achieves more satisfactory performance with more independent view pairs.
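
To make the co-training idea above concrete, here is a minimal sketch in Python with scikit-learn. It splits the term space into two views using a term evaluation function (chi-square here, which only approximates the paper's TEF_WA weighting scheme) and then grows the labeled set with each view classifier's most confident predictions. The function names `split_views_by_tef` and `co_train` are illustrative assumptions, not the authors' Co_CLM implementation.

```python
# Minimal co-training sketch, not the authors' Co_CLM code.
# Assumes scikit-learn; chi-square stands in for the paper's term evaluation functions.
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.naive_bayes import MultinomialNB


def split_views_by_tef(X, y, labeled_idx):
    """Rank terms with a term evaluation function (chi-square) computed on the
    labeled subset, then alternate terms between two views so that each view
    retains a share of the informative features."""
    scores, _ = chi2(X[labeled_idx], y[labeled_idx])
    order = np.argsort(-np.nan_to_num(scores))
    return order[0::2], order[1::2]  # feature indices of view A and view B


def co_train(X, y, labeled_idx, unlabeled_idx, rounds=10, grow=5):
    """Each round, both view classifiers label the unlabeled pool and their
    most confident predictions are added to the shared labeled set."""
    view_a, view_b = split_views_by_tef(X, y, labeled_idx)
    labeled, pool, y = list(labeled_idx), list(unlabeled_idx), y.copy()
    for _ in range(rounds):
        if not pool:
            break
        clf_a = MultinomialNB().fit(X[labeled][:, view_a], y[labeled])
        clf_b = MultinomialNB().fit(X[labeled][:, view_b], y[labeled])
        for clf, view in ((clf_a, view_a), (clf_b, view_b)):
            if not pool:
                break
            proba = clf.predict_proba(X[pool][:, view])
            confident = np.argsort(-proba.max(axis=1))[:grow]
            for i in confident:
                idx = pool[i]
                y[idx] = clf.classes_[proba[i].argmax()]  # pseudo-label
                labeled.append(idx)
            newly_added = set(labeled)
            pool = [p for p in pool if p not in newly_added]
    return clf_a, view_a, clf_b, view_b
```

A caller would pass a non-negative document-term matrix `X`, a label vector `y` (with placeholder values at unlabeled positions), and the index sets of labeled and unlabeled documents; at prediction time the two returned view classifiers can be combined, for example by multiplying their class probabilities.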