{"title":"利用集群计算支持自动和动态的数据库集群","authors":"Sylvain Guinepain, L. Gruenwald","doi":"10.1109/CLUSTR.2008.4663800","DOIUrl":null,"url":null,"abstract":"Query response time is the number one metrics when it comes to database performance. Because of data proliferation, efficient access methods and data storage techniques have become increasingly critical to maintain an acceptable query response time. Retrieving data from disk is several orders of magnitude slower than retrieving it from memory, it is easy to see the direct correlation between query response time and the number of disk I/Os. One of the common ways to reduce disk I/Os and therefore improve query response time is database clustering, which is a process that partitions the database vertically (attribute clustering) and/or horizontally (record clustering). A clustering is optimized for a given set of queries. However in dynamic systems the queries change with time, the clustering in place becomes obsolete, and the database needs to be re-clustered dynamically. This paper presents an efficient algorithm for attribute clustering that dynamically and automatically generates attribute clusters based on closed item sets mined from the attributes sets found in the queries running against the database. The paper then discusses how this algorithm can be implemented using the cluster computing paradigm to reduce query response time even further through parallelism and data redundancy.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"Using cluster computing to support automatic and dynamic database clustering\",\"authors\":\"Sylvain Guinepain, L. Gruenwald\",\"doi\":\"10.1109/CLUSTR.2008.4663800\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Query response time is the number one metrics when it comes to database performance. Because of data proliferation, efficient access methods and data storage techniques have become increasingly critical to maintain an acceptable query response time. Retrieving data from disk is several orders of magnitude slower than retrieving it from memory, it is easy to see the direct correlation between query response time and the number of disk I/Os. One of the common ways to reduce disk I/Os and therefore improve query response time is database clustering, which is a process that partitions the database vertically (attribute clustering) and/or horizontally (record clustering). A clustering is optimized for a given set of queries. However in dynamic systems the queries change with time, the clustering in place becomes obsolete, and the database needs to be re-clustered dynamically. This paper presents an efficient algorithm for attribute clustering that dynamically and automatically generates attribute clusters based on closed item sets mined from the attributes sets found in the queries running against the database. The paper then discusses how this algorithm can be implemented using the cluster computing paradigm to reduce query response time even further through parallelism and data redundancy.\",\"PeriodicalId\":198768,\"journal\":{\"name\":\"2008 IEEE International Conference on Cluster Computing\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-10-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 IEEE International Conference on Cluster Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CLUSTR.2008.4663800\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE International Conference on Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLUSTR.2008.4663800","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Using cluster computing to support automatic and dynamic database clustering
Query response time is the number one metrics when it comes to database performance. Because of data proliferation, efficient access methods and data storage techniques have become increasingly critical to maintain an acceptable query response time. Retrieving data from disk is several orders of magnitude slower than retrieving it from memory, it is easy to see the direct correlation between query response time and the number of disk I/Os. One of the common ways to reduce disk I/Os and therefore improve query response time is database clustering, which is a process that partitions the database vertically (attribute clustering) and/or horizontally (record clustering). A clustering is optimized for a given set of queries. However in dynamic systems the queries change with time, the clustering in place becomes obsolete, and the database needs to be re-clustered dynamically. This paper presents an efficient algorithm for attribute clustering that dynamically and automatically generates attribute clusters based on closed item sets mined from the attributes sets found in the queries running against the database. The paper then discusses how this algorithm can be implemented using the cluster computing paradigm to reduce query response time even further through parallelism and data redundancy.