Speeding ETL Processing in Data Warehouses Using High-Performance Joins for Changed Data Capture (CDC)
D. Tank, A. Ganatra, Y. Kosta, C. Bhensdadia
2010 International Conference on Advances in Recent Technologies in Communication and Computing, 2010-10-16
DOI: 10.1109/ARTCOM.2010.63 (https://doi.org/10.1109/ARTCOM.2010.63)
In today's fast-changing, competitive environment, data warehouse users frequently complain that access to time-critical data is too slow. Shrinking batch windows and exponentially growing data volumes place increasing demands on data warehouses to deliver instantly available information. Data warehouses must also generate accurate results consistently, but achieving both accuracy and speed with large, diverse data sets is challenging. Various operations can be used to optimize data manipulation and thus accelerate data warehouse processes. In this paper we introduce two such operations, join and aggregation, which play an integral role during preprocessing as well as in manipulating and consolidating data in a data warehouse. Our approach demonstrates how hours or even days can be saved when processing large amounts of data for ETL, data warehousing, business intelligence (BI), and other mission-critical applications.
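The abstract does not say how the join and aggregation steps are implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes changed data capture is done by comparing two snapshots of a table through a hash-based full outer join on a primary key, and that aggregation is a simple grouped sum. The function names `capture_changes` and `aggregate_by`, and the dictionary-per-row representation, are hypothetical choices made for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Row = Dict[str, object]


def capture_changes(
    old_rows: List[Row], new_rows: List[Row], key: str
) -> Dict[str, list]:
    """Classify rows as inserted, deleted, or updated between two snapshots.

    This is a hash join on `key`: each snapshot is indexed once, so the
    comparison is linear in the number of rows rather than quadratic.
    Assumes `key` is unique within each snapshot.
    """
    old_by_key: Dict[Hashable, Row] = {row[key]: row for row in old_rows}
    new_by_key: Dict[Hashable, Row] = {row[key]: row for row in new_rows}

    # Keys present only in the new snapshot are inserts.
    inserts = [row for k, row in new_by_key.items() if k not in old_by_key]
    # Keys present only in the old snapshot are deletes.
    deletes = [row for k, row in old_by_key.items() if k not in new_by_key]
    # Keys present in both snapshots with differing contents are updates.
    updates: List[Tuple[Row, Row]] = [
        (old_by_key[k], row)
        for k, row in new_by_key.items()
        if k in old_by_key and old_by_key[k] != row
    ]
    return {"insert": inserts, "delete": deletes, "update": updates}


def aggregate_by(rows: List[Row], group_key: str, value_key: str) -> Dict[Hashable, float]:
    """Sum `value_key` for each distinct value of `group_key`."""
    totals: Dict[Hashable, float] = {}
    for row in rows:
        group = row[group_key]
        totals[group] = totals.get(group, 0) + row[value_key]
    return totals
```

With this approach only the rows reported by `capture_changes` need to be loaded into the warehouse, and `aggregate_by` can consolidate them before loading. Real ETL tools typically use sort-merge or partitioned joins for data that does not fit in memory; the in-memory dictionaries here are a simplification.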