CDC and Union based near real time ETL
N. Mohammed Muddasir, K. Raghuveer
2017 2nd International Conference On Emerging Computation and Information Technologies (ICECIT), 2017-12-01
DOI: 10.1109/ICECIT.2017.8453374 (https://doi.org/10.1109/ICECIT.2017.8453374)
Data warehouse refreshment is a challenging task today because tactical decisions depend on real-time data. Near real-time techniques are employed to make real-time transaction data available at the data warehouse. These techniques rely on incremental extraction, i.e. extracting only recent changes, and on applying intelligence at the transaction site to fetch only the records that are useful for analysis. Our idea starts from the hypothesis that the analysis could be done directly on the transaction database; in practice, analysis queries are not run there, mainly because transaction data comes from disparate sources and because executing analysis queries puts additional load on the transaction database. If the changed data is small in size but has a significant impact on analysis results, we cannot afford to lose it, nor are we able to move the changes because of their size. The challenge, then, is a small number of incremental changes that significantly affect the analysis results: these changes cannot be moved, and the analysis cannot be run on the transaction database either. To resolve this, we propose a novel approach based on a change data capture (CDC) mechanism.
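The abstract does not spell out the mechanism, so the following is only a minimal sketch of the two ideas named in the title, not the authors' implementation. It assumes a trigger-populated change table as the CDC mechanism and a watermark on a sequence number for incremental extraction, and it uses a `UNION ALL` query to combine already-loaded warehouse rows with captured changes that have not been moved yet. All table and function names (`orders`, `orders_changes`, `dw_orders`, `extract_changes`, `total_with_pending`) are hypothetical, and a single in-memory SQLite database stands in for what would be separate transaction and warehouse systems.

```python
import sqlite3


def build_demo():
    """Create hypothetical transaction, change-capture, and warehouse tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE orders_changes (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            id INTEGER,
            amount REAL
        );
        CREATE TABLE dw_orders (id INTEGER PRIMARY KEY, amount REAL);

        -- Assumed CDC mechanism: a trigger records every new order.
        CREATE TRIGGER capture_insert AFTER INSERT ON orders
        BEGIN
            INSERT INTO orders_changes (id, amount) VALUES (NEW.id, NEW.amount);
        END;
        """
    )
    return conn


def extract_changes(conn, last_seq):
    """Incremental extraction: return changes after last_seq and the new watermark."""
    rows = conn.execute(
        "SELECT seq, id, amount FROM orders_changes WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    new_seq = rows[-1][0] if rows else last_seq
    return rows, new_seq


def load_changes(conn, rows):
    """Apply extracted changes to the warehouse copy."""
    conn.executemany(
        "INSERT OR REPLACE INTO dw_orders (id, amount) VALUES (?, ?)",
        [(order_id, amount) for _seq, order_id, amount in rows],
    )


def total_with_pending(conn, last_seq):
    """Total over loaded warehouse rows plus changes captured but not yet loaded."""
    (total,) = conn.execute(
        """
        SELECT COALESCE(SUM(amount), 0) FROM (
            SELECT amount FROM dw_orders
            UNION ALL
            SELECT amount FROM orders_changes WHERE seq > ?
        )
        """,
        (last_seq,),
    ).fetchone()
    return total


def run_demo():
    conn = build_demo()
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
    rows, watermark = extract_changes(conn, 0)
    load_changes(conn, rows)
    # A late change arrives after the last warehouse refresh.
    conn.execute("INSERT INTO orders VALUES (?, ?)", (3, 5.0))
    return conn, watermark
```

In this sketch, calling `total_with_pending(conn, watermark)` after `run_demo()` answers the analysis query without touching the `orders` table itself: the late change is read from the change table and combined with the warehouse rows. Only inserts are captured here; handling updates and deletes would need additional triggers and merge logic.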