A genetic algorithm-based clustering approach for database partitioning

C. Cheng, W. Lee, Kam-Fai Wong
Journal: IEEE Transactions on Systems Man and Cybernetics Part C-Applications and Re
Pages: 215-230
Publication date: 2002-08-01
Publication type: Journal Article
DOI: https://doi.org/10.1109/TSMCC.2002.804444
Citations: 117

Abstract

In a typical distributed/parallel database system, a request usually accesses only a subset of the entire database. It is, therefore, natural to organize commonly accessed data together and to place them on nearby, preferably the same, machine(s)/site(s). For this reason, data partitioning and data allocation are performance-critical issues in distributed database application design. This paper addresses data partitioning, which requires the use of clustering. Although many clustering algorithms have been proposed, their performance has not been extensively studied. Moreover, the special problem structure in clustering is rarely exploited. We explore the use of a genetic search-based clustering algorithm for data partitioning to achieve high database retrieval performance. By formulating the underlying problem as a traveling salesman problem (TSP), we can take advantage of this particular structure. Three new operators for genetic algorithms (GAs) are also proposed, and experimental results indicate that they outperform other operators in solving the TSP. The proposed GA is then applied to the data-partitioning problem. Our computational study shows that the GA performs well for this application.
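The TSP formulation mentioned in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: affinity between two data items is taken to be their co-access count, distance is the maximum affinity minus the pairwise affinity, the GA uses a textbook order-crossover-style operator and swap mutation (not the three new operators the authors propose), and clusters are obtained by cutting the longest edges of the resulting tour. All function names and parameters are hypothetical.

```python
import random

def affinity_to_distance(accesses, n_items):
    """Turn co-access counts into distances (assumed mapping: top - affinity)."""
    aff = [[0] * n_items for _ in range(n_items)]
    for request in accesses:
        for i in request:
            for j in request:
                if i != j:
                    aff[i][j] += 1
    top = max(max(row) for row in aff)
    return [[0 if i == j else top - aff[i][j] for j in range(n_items)]
            for i in range(n_items)]

def tour_length(tour, dist):
    """Length of the cyclic tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def order_crossover(p1, p2, rng):
    """Copy a slice from p1, fill the remaining slots in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(child[a:b + 1])
    fill = iter(c for c in p2 if c not in kept)
    return [c if c is not None else next(fill) for c in child]

def swap_mutation(tour, rng, rate=0.2):
    """With probability `rate`, swap two positions in a copy of the tour."""
    tour = tour[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour

def ga_tsp(dist, pop_size=60, generations=200, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    n = len(dist)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: tour_length(t, dist))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng))
        pop = elite + children
    return min(pop, key=lambda t: tour_length(t, dist))

def split_tour(tour, dist, k):
    """Cut the k longest edges of the cyclic tour to get k clusters."""
    n = len(tour)
    longest = sorted(range(n), reverse=True,
                     key=lambda i: dist[tour[i]][tour[(i + 1) % n]])[:k]
    cuts = sorted(longest)
    ends = cuts[1:] + [cuts[0] + n]
    return [[tour[i % n] for i in range(a + 1, b + 1)]
            for a, b in zip(cuts, ends)]

if __name__ == "__main__":
    # Each inner list is the set of data items touched by one request.
    accesses = [[0, 1, 2], [0, 1, 2], [0, 1], [3, 4, 5], [3, 4, 5], [4, 5]]
    dist = affinity_to_distance(accesses, 6)
    best = ga_tsp(dist)
    print(split_tour(best, dist, 2))
```

In this toy workload, items that are requested together end up adjacent in a short tour, so cutting the tour at its weakest links yields candidate partitions. A production design would need a cost model tied to actual retrieval performance, which the sketch does not attempt.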