Efficient organization of large multidimensional arrays
Sunita Sarawagi, M. Stonebraker
Proceedings of 1994 IEEE 10th International Conference on Data Engineering, 1994-02-14
DOI: 10.1109/ICDE.1994.283048 (https://doi.org/10.1109/ICDE.1994.283048)
Citations: 306
Abstract
Large multidimensional arrays are widely used in scientific and engineering database applications. The authors present methods of organizing arrays so that access on secondary and tertiary memory devices is fast and efficient. They develop four techniques: (1) storing the array in multidimensional "chunks" to minimize the number of blocks fetched, (2) reordering the chunked array to minimize seek distance between accessed blocks, (3) maintaining redundant copies of the array, each organized for a different chunk size and ordering, and (4) partitioning the array onto platters of a tertiary memory device so as to minimize the number of platter switches. Measurements on real data obtained from global change scientists show that accesses on arrays organized using these techniques are often an order of magnitude faster than on the unoptimized data.
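To make technique (1) concrete, the following is a minimal sketch, not the authors' implementation, of how chunk shape changes the number of blocks a rectangular query must fetch. It assumes one chunk is stored per disk block and that a query is an axis-aligned box with half-open bounds. The function name `chunks_touched`, the 512-element block capacity, and the example query boxes are all hypothetical choices made for illustration.

```python
def chunks_touched(lo, hi, chunk_shape):
    """Count the chunks intersected by the half-open box [lo, hi).

    lo, hi, and chunk_shape are tuples with one entry per dimension.
    Assumption: each chunk occupies exactly one block, so this count
    is also the number of blocks fetched.
    """
    count = 1
    for l, h, c in zip(lo, hi, chunk_shape):
        first = l // c          # index of the first chunk along this axis
        last = (h - 1) // c     # index of the last chunk along this axis
        count *= last - first + 1
    return count


# Two hypothetical layouts, both holding 512 elements per block.
row_like = (1, 1, 512)   # long thin chunks, similar to a row-major layout
cubic = (8, 8, 8)        # multidimensional chunks

# Query A: a small 16 x 16 x 16 subcube.
sub_lo, sub_hi = (0, 0, 0), (16, 16, 16)
subcube_row_like = chunks_touched(sub_lo, sub_hi, row_like)
subcube_cubic = chunks_touched(sub_lo, sub_hi, cubic)

# Query B: a single full row of 512 elements along the last axis.
row_lo, row_hi = (0, 0, 0), (1, 1, 512)
row_row_like = chunks_touched(row_lo, row_hi, row_like)
row_cubic = chunks_touched(row_lo, row_hi, cubic)

print("subcube query:", subcube_row_like, "vs", subcube_cubic, "blocks")
print("row query:    ", row_row_like, "vs", row_cubic, "blocks")
```

Under these assumptions the subcube query touches 256 blocks with the row-like layout but only 8 with cubic chunks, while the full-row query favors the row-like layout (1 block versus 64). No single chunk shape suits every access pattern, which is the kind of trade-off that redundant copies with different chunk sizes and orderings, technique (3), are meant to address.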