NPA: Increased Partitioning Approach for Massive Data in Real-Time Data Warehouse
Jie Song, Y. Bao
2010 2nd International Conference on Information Technology Convergence and Services, 2010-09-23
DOI: 10.1109/ITCS.2010.5581277 (https://doi.org/10.1109/ITCS.2010.5581277)
In many business and scientific data warehouses, the volume of data is growing geometrically while the demand for real-time capability is also increasing. Database partitioning, which adopts a "divide and conquer" method, can simplify the management of massive data and improve system performance; range partitioning is especially effective. Without an increased partitioning algorithm, however, the traditional range partitioning approach places a heavy burden on the system, so it is poorly suited to partitioning a real-time data warehouse. To speed up partitioning, current partitioning technology is studied and three range partitioning algorithms for massive data are proposed, all based on allowing the amount of data in each partition range to fluctuate. Experiments and applications show that the proposed algorithms are more effective and efficient at partitioning and repartitioning tables in a real-time data warehouse.
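The abstract does not spell out the three proposed algorithms, so the following is only an illustrative sketch of the general idea it describes: letting the amount of data in each range fluctuate, and repartitioning only the range that drifts too far. The class name `TolerantRangePartitioner`, the parameters `target` and `tolerance`, and the choice to split at the median are all assumptions made for this example, not the authors' method. Keys are assumed to be distinct.

```python
from bisect import bisect_right, insort


class TolerantRangePartitioner:
    """Range partitions whose sizes may drift up to `tolerance` above `target`.

    Hypothetical illustration only; not the algorithm from the paper.
    """

    def __init__(self, target, tolerance):
        self.limit = int(target * (1 + tolerance))  # largest size a partition may reach
        self.bounds = []   # bounds[i] is the smallest key of partition i + 1
        self.parts = [[]]  # each partition keeps its keys in sorted order
        self.splits = 0    # number of repartitioning steps performed so far

    def insert(self, key):
        i = bisect_right(self.bounds, key)   # index of the range that owns `key`
        insort(self.parts[i], key)
        if len(self.parts[i]) > self.limit:  # fluctuation exceeded: touch this range only
            self._split(i)

    def _split(self, i):
        part = self.parts[i]
        mid = len(part) // 2
        # Replace the oversized range with its two halves and record the new boundary.
        self.parts[i:i + 1] = [part[:mid], part[mid:]]
        self.bounds.insert(i, part[mid])
        self.splits += 1
```

With a strict size limit (`tolerance = 0`), a range is split as soon as it exceeds `target`. A larger `tolerance` lets ranges absorb more inserts before any data is moved, trading evenness of partition sizes for less repartitioning work. Each split also touches a single range rather than rebalancing the whole table.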