{"title":"基于事件的近实时数据集成体系结构","authors":"M. Naeem, G. Dobbie, Gerald Weber","doi":"10.1109/EDOCW.2008.14","DOIUrl":null,"url":null,"abstract":"Extract-transform-load (ETL) tools feed data from operational databases into data warehouses. Traditionally, these ETL tools use batch processing and operate offline at regular time intervals, for example on a nightly or weekly basis. Naturally, users prefer to have up-to-date data to make their decisions, therefore there is a demand for real-time ETL tools. In this paper we investigate an event-based near real-time ETL layer for transferring and transforming data from the operational database to the data warehouse. One of our main concerns in this paper is master data management in the ETL layer. We present the architecture of a novel, general purpose, event-driven, and near real-time ETL layer that uses a database queue (DBQ), works on a push technology principle and directly supports content enrichment. We also observe that the system architecture is consistent with the information architecture of a classical online transaction processing (OLTP) application, allowing us to distinguish between different kinds of data to increase the clarity of the design.","PeriodicalId":205960,"journal":{"name":"2008 12th Enterprise Distributed Object Computing Conference Workshops","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"49","resultStr":"{\"title\":\"An Event-Based Near Real-Time Data Integration Architecture\",\"authors\":\"M. Naeem, G. Dobbie, Gerald Weber\",\"doi\":\"10.1109/EDOCW.2008.14\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Extract-transform-load (ETL) tools feed data from operational databases into data warehouses. Traditionally, these ETL tools use batch processing and operate offline at regular time intervals, for example on a nightly or weekly basis. Naturally, users prefer to have up-to-date data to make their decisions, therefore there is a demand for real-time ETL tools. In this paper we investigate an event-based near real-time ETL layer for transferring and transforming data from the operational database to the data warehouse. One of our main concerns in this paper is master data management in the ETL layer. We present the architecture of a novel, general purpose, event-driven, and near real-time ETL layer that uses a database queue (DBQ), works on a push technology principle and directly supports content enrichment. We also observe that the system architecture is consistent with the information architecture of a classical online transaction processing (OLTP) application, allowing us to distinguish between different kinds of data to increase the clarity of the design.\",\"PeriodicalId\":205960,\"journal\":{\"name\":\"2008 12th Enterprise Distributed Object Computing Conference Workshops\",\"volume\":\"53 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"49\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 12th Enterprise Distributed Object Computing Conference Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/EDOCW.2008.14\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 12th Enterprise Distributed Object Computing Conference Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/EDOCW.2008.14","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An Event-Based Near Real-Time Data Integration Architecture
Extract-transform-load (ETL) tools feed data from operational databases into data warehouses. Traditionally, these ETL tools use batch processing and operate offline at regular time intervals, for example on a nightly or weekly basis. Naturally, users prefer to have up-to-date data to make their decisions, therefore there is a demand for real-time ETL tools. In this paper we investigate an event-based near real-time ETL layer for transferring and transforming data from the operational database to the data warehouse. One of our main concerns in this paper is master data management in the ETL layer. We present the architecture of a novel, general purpose, event-driven, and near real-time ETL layer that uses a database queue (DBQ), works on a push technology principle and directly supports content enrichment. We also observe that the system architecture is consistent with the information architecture of a classical online transaction processing (OLTP) application, allowing us to distinguish between different kinds of data to increase the clarity of the design.