A Co-training Approach based TEF-WA technique

Tang Huanling, Lu Mingyu, Liu Na
{"title":"A Co-training Approach based TEF-WA technique","authors":"Tang Huanling, Lu Mingyu, Liu Na","doi":"10.1109/NPC.2007.104","DOIUrl":null,"url":null,"abstract":"Traditional categorization algorithm suffers from not having sufficient labeled training data for learning, while large amount unlabeled data are easily available. We investigate co-training algorithm and its assumption that the features set can be split into two compatible and independent views. However, the assumption is usually violated to some degree in practice and sometimes the natural feature split does not exist. So we adopt TEF_WA technique which utilizes term evaluation functions to split features set and construct multiple views. We can choose a pair of views which are compatible and independent to certain degree. Based TEF_WA technique we develop a semi-supervised categorization algorithm Co_CLM. Experimental results show Co_CLM can significantly decrease the classification error utilizing unlabeled data especially labeled data is sparse. Our experimental results also indicate Co_CLM will achieve more satisfactory performance with the more independent view pairs.","PeriodicalId":278518,"journal":{"name":"2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NPC.2007.104","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Traditional categorization algorithms suffer from insufficient labeled training data for learning, while large amounts of unlabeled data are easily available. We investigate the co-training algorithm and its assumption that the feature set can be split into two compatible and independent views. In practice, however, this assumption is usually violated to some degree, and sometimes no natural feature split exists. We therefore adopt the TEF_WA technique, which uses term evaluation functions to split the feature set and construct multiple views, from which a pair of views that are compatible and independent to a certain degree can be chosen. Based on the TEF_WA technique, we develop a semi-supervised categorization algorithm, Co_CLM. Experimental results show that Co_CLM can significantly decrease classification error by utilizing unlabeled data, especially when labeled data are sparse. Our results also indicate that Co_CLM achieves more satisfactory performance with more independent view pairs.
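
To make the co-training idea above concrete, here is a minimal sketch in Python with scikit-learn. It splits the term space into two views using a term evaluation function (chi-square here, which only approximates the paper's TEF_WA weighting scheme) and then grows the labeled set with each view classifier's most confident predictions. The function names `split_views_by_tef` and `co_train` are illustrative assumptions, not the authors' Co_CLM implementation.

```python
# Minimal co-training sketch, not the authors' Co_CLM code.
# Assumes scikit-learn; chi-square stands in for the paper's term evaluation functions.
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.naive_bayes import MultinomialNB


def split_views_by_tef(X, y, labeled_idx):
    """Rank terms with a term evaluation function (chi-square) computed on the
    labeled subset, then alternate terms between two views so that each view
    retains a share of the informative features."""
    scores, _ = chi2(X[labeled_idx], y[labeled_idx])
    order = np.argsort(-np.nan_to_num(scores))
    return order[0::2], order[1::2]  # feature indices of view A and view B


def co_train(X, y, labeled_idx, unlabeled_idx, rounds=10, grow=5):
    """Each round, both view classifiers label the unlabeled pool and their
    most confident predictions are added to the shared labeled set."""
    view_a, view_b = split_views_by_tef(X, y, labeled_idx)
    labeled, pool, y = list(labeled_idx), list(unlabeled_idx), y.copy()
    for _ in range(rounds):
        if not pool:
            break
        clf_a = MultinomialNB().fit(X[labeled][:, view_a], y[labeled])
        clf_b = MultinomialNB().fit(X[labeled][:, view_b], y[labeled])
        for clf, view in ((clf_a, view_a), (clf_b, view_b)):
            if not pool:
                break
            proba = clf.predict_proba(X[pool][:, view])
            confident = np.argsort(-proba.max(axis=1))[:grow]
            for i in confident:
                idx = pool[i]
                y[idx] = clf.classes_[proba[i].argmax()]  # pseudo-label
                labeled.append(idx)
            newly_added = set(labeled)
            pool = [p for p in pool if p not in newly_added]
    return clf_a, view_a, clf_b, view_b
```

A caller would pass a non-negative document-term matrix `X`, a label vector `y` (with placeholder values at unlabeled positions), and the index sets of labeled and unlabeled documents; at prediction time the two returned view classifiers can be combined, for example by multiplying their class probabilities.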