Extract-Load-Transform (ELT) Process Runtime Analysis and Optimization

A. E. Zvonarev, D.S. Gudilin, Dmitriy A. Lychagin, B. Goryachkin
Published in: 2023 5th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE), 2023-03-16
DOI: 10.1109/REEPE57272.2023.10086728 (https://doi.org/10.1109/REEPE57272.2023.10086728)

Abstract

The article discusses algorithms for optimizing the transformation stage of an ELT process that is built on procedures in the PostgreSQL DBMS, with parallelization implemented using Python tools. The baseline for comparison is the most trivial version of the data transformation process, in which the procedures are executed sequentially, one after another. The first proposed algorithm applies the simplest form of parallelization: procedures that do not depend on one another are executed in parallel. The second algorithm improves on the first. It uses step-by-step optimization with additional parallelization of chain blocks of dependent procedures. The main evaluation criterion is the execution time of the entire chain of procedures. The study found that the improved version of the procedure parallelization algorithm gives the shortest execution time for the entire chain of the data transformation stage.
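
The contrast between the sequential baseline and parallel execution of independent procedures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dependency graph, the procedure names, and the `call_procedure` stand-in (which sleeps instead of issuing a PostgreSQL `CALL`) are all hypothetical, and the scheduler shown is a generic dependency-aware one built on `concurrent.futures`, chosen only to show why overlapping independent procedures can shorten the total runtime of the chain.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Hypothetical dependency graph: procedure name -> procedures it depends on.
DEPENDENCIES = {
    "load_customers": [],
    "load_orders": [],
    "build_order_facts": ["load_customers", "load_orders"],
    "build_customer_stats": ["load_customers"],
    "build_report": ["build_order_facts", "build_customer_stats"],
}


def call_procedure(name, delay=0.05):
    """Stand-in for calling a database procedure; sleeps to simulate work."""
    time.sleep(delay)
    return name


def run_sequential(deps):
    """Baseline: run procedures one at a time, respecting dependencies."""
    order = []
    while len(order) < len(deps):
        progressed = False
        for name, parents in deps.items():
            if name not in order and all(p in order for p in parents):
                call_procedure(name)
                order.append(name)
                progressed = True
        if not progressed:
            raise ValueError("dependency cycle detected")
    return order


def run_parallel(deps, max_workers=4):
    """Start each procedure as soon as everything it depends on has finished."""
    done, order = set(), []
    running = {}  # future -> procedure name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(done) < len(deps):
            in_flight = set(running.values())
            for name, parents in deps.items():
                ready = all(p in done for p in parents)
                if name not in done and name not in in_flight and ready:
                    running[pool.submit(call_procedure, name)] = name
            if not running:
                raise ValueError("dependency cycle detected")
            finished, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in finished:
                name = running.pop(future)
                future.result()  # re-raise any error from the worker thread
                done.add(name)
                order.append(name)
    return order


if __name__ == "__main__":
    for runner in (run_sequential, run_parallel):
        start = time.perf_counter()
        runner(DEPENDENCIES)
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f} s")
```

Threads are used here on the assumption that each worker would spend its time waiting on the database server rather than computing in Python; in a real setup each worker would also need its own database connection.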