DATA PIPELINE ARCHITECTURE FOR ACADEMIC INFORMATION SYSTEM AT AKADEMI TEKNIK BIAK

Journal of Intelligent Software Systems, Pub Date: 2024-07-18, DOI: 10.26798/jiss.v3i1.1335

Heman Koreri Israel Mnsen, Bambang Purnomosidi, Rikie Kartadie, Didi Kurnaedi



Abstract

In developing an integrated information system, architecture planning is the first step that must be established. Development planning is needed so that the system runs according to requirements. The data used in this research are internal data of Akademi Teknik Biak (ATB, Biak Technical Academy) and external data of the Higher Education Service Institution for Region IV in Biak, Papua. The main goal of this research is to design a data pipeline architecture for ATB. The pipeline architecture is used to move big data resources efficiently from one location to another over long distances. The method used is Extract, Transform, Load (ETL). The extract step requires a supporting library in Apache Spark, namely the Spark session. The Spark session is created so that the academy's data, stored in CSV files, can be loaded and processed in Apache Spark. Once extraction is set up, Apache Spark reads the CSV data and transforms it. The transformed data are then loaded into a data frame as the output of the ETL process. The results indicate that Apache Spark made it easier for the authors to design the information system processes of Biak Technical Academy, and that it is one of the best solutions for large-scale, real-time ETL processing.
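The ETL flow described in the abstract (create a Spark session, read CSV data, transform it, and load the result into a data frame) might look roughly like the PySpark sketch below. The abstract does not say which transformations were applied, so the header clean-up and duplicate removal here are illustrative assumptions, as are the names `normalize_column_name`, `run_etl`, the `csv_path` parameter, and the application name `atb-etl-sketch`.

```python
import re


def normalize_column_name(name: str) -> str:
    """Turn a raw CSV header such as ' Student Name ' into 'student_name'.

    Hypothetical transform step; the paper does not specify its transforms.
    """
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()


def run_etl(csv_path: str):
    """Extract a CSV file with Spark, tidy its headers, return a DataFrame."""
    # Imported lazily so normalize_column_name works without Spark installed.
    from pyspark.sql import SparkSession

    # Extract: the SparkSession is the entry point for reading data.
    spark = SparkSession.builder.appName("atb-etl-sketch").getOrCreate()
    raw = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Transform (assumed): normalize column names and drop duplicate rows.
    renamed = raw.toDF(*[normalize_column_name(c) for c in raw.columns])
    cleaned = renamed.dropDuplicates()

    # Load: in this sketch the target is simply the resulting DataFrame.
    return cleaned
```

Calling `run_etl` with the path to a CSV file (for example a hypothetical `students.csv`) would return a Spark DataFrame that can be inspected with `.show()` or written to another store. Keeping the column-renaming logic in a plain function makes that step easy to check without starting a Spark cluster.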


