GRAPH BASED CLUSTERING WITH CONSTRAINTS AND ACTIVE LEARNING
Vu-Tuan Dang, V. Vu, Hong-Quan Do, Thi Kieu Oanh Le
Journal of Computer Science and Cybernetics, pp. 71-89. Published 2021-03-29.
DOI: https://doi.org/10.15625/1813-9663/37/1/15773
Abstract
During the past few years, semi-supervised clustering has emerged as an interesting new direction in machine learning research. In a semi-supervised clustering algorithm, the clustering results can be significantly improved by using side information, which is either available in advance or collected from users. Two main kinds of side information can be exploited in semi-supervised clustering algorithms: class labels (seeds) and pairwise constraints. In this paper, we propose a semi-supervised graph-based clustering algorithm, called MCSSGC, that uses both seeds and constraints in the clustering process. We also introduce a simple but efficient active learning method, named KMMFFQS, to collect constraints that can boost the performance of MCSSGC. The results obtained show that the proposed algorithm can significantly improve the clustering process compared to some recent algorithms.
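To make the two kinds of side information concrete, the following minimal sketch shows how a set of seeds implies a set of pairwise constraints. It is an illustration only, not the MCSSGC or KMMFFQS procedure. The function name `constraints_from_seeds` and the toy data are hypothetical, and the must-link/cannot-link terminology is the usual convention for pairwise constraints rather than wording taken from the abstract.

```python
from itertools import combinations


def constraints_from_seeds(seeds):
    """Derive pairwise constraints from labelled seeds.

    seeds: dict mapping a point index to its class label (hypothetical format).
    Returns (must_link, cannot_link), each a list of index pairs.
    """
    must_link, cannot_link = [], []
    # Compare every pair of seeds once, in index order.
    for (i, label_i), (j, label_j) in combinations(sorted(seeds.items()), 2):
        if label_i == label_j:
            must_link.append((i, j))    # same label: should share a cluster
        else:
            cannot_link.append((i, j))  # different labels: should be separated
    return must_link, cannot_link


# Toy example: points 0 and 3 carry label "a", point 7 carries label "b".
seeds = {0: "a", 3: "a", 7: "b"}
must_link, cannot_link = constraints_from_seeds(seeds)
```

The conversion only goes one way: seeds determine constraints, but a set of constraints does not in general determine class labels, which is one reason an algorithm may want to handle both forms of side information.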