Imbalanced Hypergraph Partitioning and Improvements for Consensus Clustering

John Robert Yaros, T. Imielinski
{"title":"非平衡超图划分及共识聚类的改进","authors":"John Robert Yaros, T. Imielinski","doi":"10.1109/ICTAI.2013.61","DOIUrl":null,"url":null,"abstract":"Hypergraph partitioning is typically defined as an optimization problem wherein vertices are placed in separate parts (of a partition) such that the fewest number of hyperedges will span multiple parts. To ensure that parts have sizes satisfying user requirements, constraints are typically imposed. Under such constraints, the problem is known to be NP-Hard, so heuristic methods are needed to find approximate solutions in reasonable time. Circuit layout has historically been one of the most prominent application areas and has seen a proliferation of tools designed to satisfy its needs. Constraints in these tools typically focus on equal size parts, allowing the user to specify a maximum tolerance for deviation from that equal size. A more generalized constraint allows the user to define fixed sizes and tolerances for each part. More recently, other domains have mapped problems to hypergraph partitioning and, perhaps due to their availability, have used existing tools to perform partitioning. In particular, consensus clustering easily fits a hypergraph representation where each cluster of each input clustering is represented by a hyperedge. Authors of such research have reported partitioning tends to only have good results when clusters can be expected to be roughly the same size, an unsurprising result given the tools' focus on equal sized parts. Thus, even though many datasets have \"natural\" part sizes that are mixed, the current toolset is ill-suited to find good solutions unless those part sizes are known a priori. We argue that the main issue rests in the current constraint definitions and their focus measuring imbalance on the basis of the largest/smallest part. We further argue that, due to its holistic nature, entropy best measures imbalance and can best guide the partition method to the natural part sizes with lowest cut for a given level of imbalance. We provide a method that finds good approximate solutions under an entropy constraint and further introduce the notion of a discount cut, which helps overcome local optima that frequently plague k-way partitioning algorithms. In comparison to today's popular tools, we show our method returns sizable improvements in cut size as the level of imbalance grows. In consensus clustering, we demonstrate that good solutions are more easily achieved even when part sizes are not roughly equal.","PeriodicalId":140309,"journal":{"name":"2013 IEEE 25th International Conference on Tools with Artificial Intelligence","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":"{\"title\":\"Imbalanced Hypergraph Partitioning and Improvements for Consensus Clustering\",\"authors\":\"John Robert Yaros, T. Imielinski\",\"doi\":\"10.1109/ICTAI.2013.61\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Hypergraph partitioning is typically defined as an optimization problem wherein vertices are placed in separate parts (of a partition) such that the fewest number of hyperedges will span multiple parts. To ensure that parts have sizes satisfying user requirements, constraints are typically imposed. Under such constraints, the problem is known to be NP-Hard, so heuristic methods are needed to find approximate solutions in reasonable time. 
Circuit layout has historically been one of the most prominent application areas and has seen a proliferation of tools designed to satisfy its needs. Constraints in these tools typically focus on equal size parts, allowing the user to specify a maximum tolerance for deviation from that equal size. A more generalized constraint allows the user to define fixed sizes and tolerances for each part. More recently, other domains have mapped problems to hypergraph partitioning and, perhaps due to their availability, have used existing tools to perform partitioning. In particular, consensus clustering easily fits a hypergraph representation where each cluster of each input clustering is represented by a hyperedge. Authors of such research have reported partitioning tends to only have good results when clusters can be expected to be roughly the same size, an unsurprising result given the tools' focus on equal sized parts. Thus, even though many datasets have \\\"natural\\\" part sizes that are mixed, the current toolset is ill-suited to find good solutions unless those part sizes are known a priori. We argue that the main issue rests in the current constraint definitions and their focus measuring imbalance on the basis of the largest/smallest part. We further argue that, due to its holistic nature, entropy best measures imbalance and can best guide the partition method to the natural part sizes with lowest cut for a given level of imbalance. We provide a method that finds good approximate solutions under an entropy constraint and further introduce the notion of a discount cut, which helps overcome local optima that frequently plague k-way partitioning algorithms. In comparison to today's popular tools, we show our method returns sizable improvements in cut size as the level of imbalance grows. In consensus clustering, we demonstrate that good solutions are more easily achieved even when part sizes are not roughly equal.\",\"PeriodicalId\":140309,\"journal\":{\"name\":\"2013 IEEE 25th International Conference on Tools with Artificial Intelligence\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-11-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE 25th International Conference on Tools with Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTAI.2013.61\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE 25th International Conference on Tools with Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2013.61","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 11

Abstract

Hypergraph partitioning is typically defined as an optimization problem wherein vertices are placed in separate parts (of a partition) such that the fewest hyperedges span multiple parts. To ensure that parts have sizes satisfying user requirements, constraints are typically imposed. Under such constraints, the problem is known to be NP-hard, so heuristic methods are needed to find approximate solutions in reasonable time. Circuit layout has historically been one of the most prominent application areas and has seen a proliferation of tools designed to satisfy its needs. Constraints in these tools typically focus on equal-size parts, allowing the user to specify a maximum tolerance for deviation from that equal size. A more generalized constraint allows the user to define fixed sizes and tolerances for each part. More recently, other domains have mapped problems to hypergraph partitioning and, perhaps due to their availability, have used existing tools to perform partitioning. In particular, consensus clustering easily fits a hypergraph representation where each cluster of each input clustering is represented by a hyperedge. Authors of such research have reported that partitioning tends to give good results only when clusters can be expected to be roughly the same size, an unsurprising outcome given the tools' focus on equal-sized parts. Thus, even though many datasets have "natural" part sizes that are mixed, the current toolset is ill-suited to finding good solutions unless those part sizes are known a priori. We argue that the main issue rests in the current constraint definitions and their focus on measuring imbalance in terms of the largest/smallest part. We further argue that, due to its holistic nature, entropy best measures imbalance and can best guide the partitioning method to the natural part sizes with the lowest cut for a given level of imbalance. We provide a method that finds good approximate solutions under an entropy constraint and further introduce the notion of a discount cut, which helps overcome the local optima that frequently plague k-way partitioning algorithms. In comparison to today's popular tools, we show that our method yields sizable improvements in cut size as the level of imbalance grows. In consensus clustering, we demonstrate that good solutions are more easily achieved even when part sizes are not roughly equal.
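
The abstract describes, but does not formalize, two of its central ideas: the hypergraph encoding of consensus clustering (each cluster of each input clustering becomes a hyperedge) and entropy as a holistic measure of partition imbalance. The Python sketch below is only an illustration of those readings, not the authors' implementation; the function names, the toy clusterings, and the candidate partition are hypothetical, and the paper's exact definitions (vertex weights, the discount cut, the precise form of the entropy constraint) may differ.

```python
# A minimal sketch (not the authors' implementation) of the hypergraph view of
# consensus clustering and of entropy as a holistic imbalance measure.
# Assumptions: each input clustering is a list of labels over the same items;
# each cluster of each input clustering becomes one hyperedge; the cut counts
# hyperedges that span more than one part of a candidate partition.

from collections import defaultdict
from math import log2


def clusterings_to_hyperedges(clusterings):
    """Map each cluster of each input clustering to a hyperedge (a frozenset of item indices)."""
    hyperedges = []
    for labels in clusterings:
        clusters = defaultdict(set)
        for item, label in enumerate(labels):
            clusters[label].add(item)
        hyperedges.extend(frozenset(members) for members in clusters.values())
    return hyperedges


def cut_size(hyperedges, part_of):
    """Number of hyperedges whose vertices fall into more than one part."""
    return sum(1 for e in hyperedges if len({part_of[v] for v in e}) > 1)


def partition_entropy(part_of):
    """Shannon entropy of the part-size distribution: H = -sum_i p_i * log2(p_i).

    Equal-size parts maximize H; highly imbalanced partitions give low H, so a
    lower bound on H (one plausible reading of the paper's entropy constraint)
    bounds imbalance while still allowing mixed part sizes.
    """
    counts = defaultdict(int)
    for p in part_of.values():
        counts[p] += 1
    n = len(part_of)
    return -sum((c / n) * log2(c / n) for c in counts.values())


if __name__ == "__main__":
    # Two toy input clusterings over six items (hypothetical data).
    clusterings = [
        [0, 0, 0, 1, 1, 1],   # clustering A: {0,1,2}, {3,4,5}
        [0, 0, 1, 1, 2, 2],   # clustering B: {0,1}, {2,3}, {4,5}
    ]
    hyperedges = clusterings_to_hyperedges(clusterings)

    # A candidate 2-way consensus partition: item -> part id.
    part_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
    print("cut size:", cut_size(hyperedges, part_of))         # 1 hyperedge spans both parts
    print("entropy :", round(partition_entropy(part_of), 3))  # 1.0 for two equal parts
```

Unlike a tolerance on only the largest or smallest part, the entropy of the part-size distribution depends on every part, which is one way to read the abstract's claim that entropy measures imbalance "holistically" and can steer the partitioner toward the data's natural, possibly mixed, part sizes at a given imbalance level.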