MyETL: A Java Software Tool to Extract, Transform, and Load Your Business
Nuovo Michele
CRIS - Bulletin of the Centre for Research and Interdisciplinary Study
DOI: 10.1515/cris-2015-0011 (https://doi.org/10.1515/cris-2015-0011)
Abstract
The project follows the development of a Java software tool that extracts data from flat files (fixed-length record type), CSV (comma-separated values) files, and XLS (Microsoft Excel 97-2003 worksheet) files, applies transformations to those sources, and finally loads the data into the target RDBMS. The software implements a process known as ETL (Extract, Transform, and Load); systems of this kind are called ETL systems. The analysis involved research on the theory behind the ETL process as well as the theory behind the various phases of the applied methodology. An in-depth look at the design and architecture of the software was also taken. To produce a complete design for the implementation, different techniques and diagrams were used to visualise and refine ideas: UML class diagrams, system architecture diagrams, a physical data model, and a project timeline. The implementation translated the system architecture into working software using the Extreme Programming methodology and the Java programming language. A mapping algorithm module and design patterns were used in the implementation phase, and a transformation syntax was defined to express data transformations. The software was tested with unit tests, and a formal test plan was prepared to ensure that the main features of the system worked as defined. Error-handling code was developed to prevent unexpected crashes and to report problems or errors to the user.
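
The extract, transform, and load flow described in the abstract can be illustrated with a minimal Java sketch. This is not the MyETL implementation: the class name EtlSketch, its method names, the two-field record layout, and the upper-casing rule are all hypothetical and chosen only for illustration. An in-memory list stands in for the target RDBMS, the CSV parser assumes no quoted fields, and XLS input is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of an ETL flow; none of these names come from MyETL.
final class EtlSketch {

    /** Extract: slice one fixed-length record into named fields by column width. */
    static Map<String, String> parseFixedLength(String record, String[] names, int[] widths) {
        Map<String, String> row = new LinkedHashMap<>();
        int offset = 0;
        for (int i = 0; i < names.length; i++) {
            // Clamp to the record length so short records yield empty fields.
            int start = Math.min(offset, record.length());
            int end = Math.min(offset + widths[i], record.length());
            row.put(names[i], record.substring(start, end).trim());
            offset += widths[i];
        }
        return row;
    }

    /** Extract: split one simple CSV line (assumes no quoted fields) into named fields. */
    static Map<String, String> parseCsv(String line, String[] names) {
        String[] cells = line.split(",", -1); // -1 keeps trailing empty cells
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i], i < cells.length ? cells[i].trim() : "");
        }
        return row;
    }

    /** Transform: an assumed example rule that upper-cases a single field. */
    static Map<String, String> upperCase(Map<String, String> row, String field) {
        Map<String, String> out = new LinkedHashMap<>(row);
        out.computeIfPresent(field, (key, value) -> value.toUpperCase(Locale.ROOT));
        return out;
    }

    /** Load: append the row to an in-memory list standing in for an RDBMS table. */
    static void load(List<Map<String, String>> target, Map<String, String> row) {
        target.add(row);
    }

    public static void main(String[] args) {
        String[] names = {"id", "name"};
        List<Map<String, String>> target = new ArrayList<>();

        // One row from a fixed-length source (3-character id, 10-character name).
        load(target, upperCase(parseFixedLength("001alice     ", names, new int[] {3, 10}), "name"));
        // One row from a CSV source.
        load(target, upperCase(parseCsv("002, bob", names), "name"));

        System.out.println(target);
    }
}
```

In a real tool the load step would more plausibly write to the database through JDBC, and the transformation would be driven by the transformation syntax the abstract mentions rather than by a hard-coded rule.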