Probabilistic and Graphical Model based Genetic Algorithm Driven Clustering with Instance-level Constraints

Yi Hong, S. Kwong, Hanli Wang, Qingsheng Ren, Yuchou Chang
Published in: 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence)
Publication date: 2008-06-01
DOI: 10.1109/CEC.2008.4630817 (https://doi.org/10.1109/CEC.2008.4630817)
Citations: 9

Abstract

Clustering is traditionally viewed as an unsupervised method for data analysis. However, several recent studies have shown that a limited amount of prior instance-level knowledge can significantly improve the performance of clustering algorithms. This paper proposes a semi-supervised clustering algorithm termed the probabilistic and graphical model based genetic algorithm driven clustering with instance-level constraints (Cop-CGA). In Cop-CGA, all prior knowledge about pairs of instances that should or should not be placed in the same group is represented as a graph, and all candidate clustering solutions are sampled from this graph, using different orders to assign instances to a fixed number of groups. We illustrate how to design Cop-CGA so that all candidate solutions are guaranteed to satisfy the given constraints, and we demonstrate the usefulness of background knowledge for genetic algorithm driven clustering through experiments on several real data sets with artificial hard constraints. One advantage of Cop-CGA is that both positive and negative instance-level constraints can be easily incorporated. Moreover, the performance of Cop-CGA is not sensitive to the order in which instances are assigned to groups.
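The abstract describes sampling candidate solutions from a constraint graph so that every candidate satisfies the constraints, but it does not give the procedure itself. The following is only a minimal sketch of one way such sampling could work, not the authors' Cop-CGA: it assumes must-link pairs are merged with union-find, cannot-link pairs become edges between the merged components, and each component is given a uniformly random group among those not already taken by a conflicting component. The function names, the uniform choice (a probabilistic-model GA would presumably use learned probabilities here) and the retry limit `max_tries` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def sample_clustering(n, k, must_link, cannot_link, rng, max_tries=100):
    """Sample labels for n instances in k groups that satisfy all constraints.

    Returns a list of n labels in range(k), or None if the constraints
    contradict each other or no feasible labelling is found in max_tries
    random assignment orders. (Hypothetical helper, not from the paper.)
    """
    # Collapse must-link pairs into connected components.
    parent = list(range(n))
    for a, b in must_link:
        parent[find(parent, a)] = find(parent, b)
    roots = [find(parent, i) for i in range(n)]

    # Cannot-link pairs become edges between components.
    conflicts = defaultdict(set)
    for a, b in cannot_link:
        ra, rb = roots[a], roots[b]
        if ra == rb:
            return None  # a pair is both must-link and cannot-link
        conflicts[ra].add(rb)
        conflicts[rb].add(ra)

    components = sorted(set(roots))
    for _ in range(max_tries):
        order = components[:]
        rng.shuffle(order)  # each candidate uses a different assignment order
        label = {}
        for comp in order:
            used = {label[c] for c in conflicts[comp] if c in label}
            allowed = [g for g in range(k) if g not in used]
            if not allowed:
                break  # dead end under this order; try another order
            label[comp] = rng.choice(allowed)
        else:
            return [label[r] for r in roots]
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Instances 0-1 and 2-3 must share a group; 1-2 and 4-5 must not.
    print(sample_clustering(6, 3, [(0, 1), (2, 3)], [(1, 2), (4, 5)], rng))
```

Because deciding whether cannot-link constraints can be satisfied with k groups is a graph-colouring problem, the greedy assignment in this sketch can reach a dead end even when a feasible labelling exists, which is why it retries with a new order and may return None.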