Text classification based on semi-supervised learning

2013 International Conference on Soft Computing and Pattern Recognition (SoCPaR). Pub Date: 2013-12-01. DOI: 10.1109/SOCPAR.2013.7054133

Vo Duy Thanh, P. M. Tuan, V. T. Hung, Doan Van Ban

Citations: 7

Abstract

In this paper, we present our solution and experimental results for applying semi-supervised machine learning techniques and an improved SVM algorithm to build text classification applications. First, we create a features model based on labeled data, and then improve it using unlabeled data. The technique for assigning a label to new data is based on binary classification. Our experiments are run on three data layers extracted from articles on three topics (sports, entertainment, and education) on VNEXPRESS.NET. We compare classification accuracy before and after improving the features model with the semi-supervised learning method, using a classification algorithm based on the SVM model. The experiments show that classification quality improves once the features model has been improved.
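The abstract does not specify the features model or the SVM modification, so the following is only a minimal sketch of the general idea, under stated assumptions: bag-of-words counts stand in for the features model, each topic gets its own binary linear SVM (trained here with a Pegasos-style subgradient method, swapped in for illustration), and an unlabeled document is pseudo-labeled only when exactly one binary classifier accepts it. All function names, parameters, and the toy English documents are hypothetical and are not taken from the paper or from the VNEXPRESS.NET data.

```python
from collections import Counter

CLASSES = ["sports", "entertainment", "education"]

def featurize(text):
    """Bag-of-words term counts; an assumed stand-in for the features model."""
    return Counter(text.lower().split())

def dot(w, x):
    """Sparse dot product between a weight dict and a feature Counter."""
    return sum(w.get(term, 0.0) * count for term, count in x.items())

def train_binary_svm(samples, epochs=50, lam=0.01):
    """Linear SVM (no bias term) via Pegasos-style subgradient descent.

    samples: list of (features, y) pairs with y in {+1, -1}.
    """
    w, t = {}, 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            violated = y * dot(w, x) < 1   # is the hinge loss active?
            for term in w:                 # regularization shrink by (1 - eta*lam)
                w[term] *= 1.0 - 1.0 / t
            if violated:
                for term, count in x.items():
                    w[term] = w.get(term, 0.0) + eta * y * count
    return w

def train_one_vs_rest(labeled):
    """One binary SVM per topic: that topic versus everything else."""
    return {c: train_binary_svm([(x, 1 if y == c else -1) for x, y in labeled])
            for c in CLASSES}

def self_train(labeled, unlabeled, rounds=3, threshold=0.0):
    """Grow the labeled set with confidently pseudo-labeled documents."""
    labeled, pool = list(labeled), list(unlabeled)
    models = train_one_vs_rest(labeled)
    for _ in range(rounds):
        accepted, remaining = [], []
        for x in pool:
            hits = [c for c in CLASSES if dot(models[c], x) > threshold]
            if len(hits) == 1:             # exactly one binary classifier accepts
                accepted.append((x, hits[0]))
            else:
                remaining.append(x)
        if not accepted:
            break
        labeled += accepted
        pool = remaining
        models = train_one_vs_rest(labeled)  # retrain on the enlarged set
    return models, labeled

def predict(models, x):
    """Pick the topic whose binary classifier gives the largest score."""
    return max(models, key=lambda c: dot(models[c], x))

# Toy data (hypothetical): two labeled documents per topic, three unlabeled.
labeled = [(featurize(t), y) for t, y in [
    ("football match goal", "sports"),
    ("tennis player match win", "sports"),
    ("movie actor film", "entertainment"),
    ("singer concert film", "entertainment"),
    ("school teacher exam", "education"),
    ("university student exam", "education"),
]]
unlabeled = [featurize(t) for t in [
    "goal football win", "film actor concert", "exam student school",
]]

models, final_labeled = self_train(labeled, unlabeled)
print(len(final_labeled), predict(models, featurize("football goal")))
```

Accepting a pseudo-label only when a single binary classifier fires is one conservative reading of "based on binary classification"; a real system would likely raise `threshold`, add a bias term, and evaluate accuracy on held-out data before and after the self-training rounds.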
