Performance evaluation of grid based multi-attribute record declustering methods
Bhaskar Himatsingka, J. Srivastava
Proceedings of 1994 IEEE 10th International Conference on Data Engineering, 1994-02-14
https://doi.org/10.1109/ICDE.1994.283051
We focus on multi-attribute declustering methods that are based on some type of grid-based partitioning of the data space. We derive theoretical results showing that no declustering method can be strictly optimal for range queries if the number of disks is greater than 5. A detailed performance evaluation examines how various declustering schemes perform under a wide range of query and database scenarios, both relative to each other and relative to the optimal. The parameters varied include the shape and size of queries, the database size, the number of attributes, and the number of disks. The results show that information about the common queries on a relation is very important and ought to be used in deciding its declustering, and that this is especially crucial for small queries. Also, there is no clear winner, so parallel database systems must support a number of declustering methods.
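To make the comparison "relative to the optimal" concrete, the following minimal sketch assigns grid cells to disks and measures a range query's response time as the largest number of cells any single disk must read. The abstract does not name the schemes it evaluates; Disk Modulo (the sum of a cell's coordinates modulo the number of disks) is used here only as one well-known grid-based scheme, and the function names `disk_modulo`, `response_time`, and `optimal_time` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import ceil


def disk_modulo(cell, num_disks):
    """Assumed example scheme: disk = (sum of grid coordinates) mod num_disks."""
    return sum(cell) % num_disks


def response_time(query, num_disks, assign=disk_modulo):
    """Largest number of grid cells that any one disk serves for a range query.

    `query` is a list of inclusive (low, high) cell ranges, one per attribute.
    """
    ranges = [range(lo, hi + 1) for lo, hi in query]
    load = Counter(assign(cell, num_disks) for cell in product(*ranges))
    return max(load.values())


def optimal_time(query, num_disks):
    """Lower bound: the query's cells spread perfectly evenly over the disks."""
    cells = 1
    for lo, hi in query:
        cells *= hi - lo + 1
    return ceil(cells / num_disks)


if __name__ == "__main__":
    disks = 4
    square = [(0, 1), (0, 1)]  # 2 x 2 query
    row = [(0, 0), (0, 3)]     # 1 x 4 query
    for name, q in (("square", square), ("row", row)):
        print(name, response_time(q, disks), optimal_time(q, disks))
```

Under these assumptions, a 1 x 4 query on 4 disks meets the lower bound, while a 2 x 2 query places two cells on the same disk and so takes twice the optimal time. This illustrates why query shape matters when choosing a declustering.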