Research on Creating a Data Warehouse Based on E-Commerce

Gulzat Turken, Van Pey, Z. Abdiakhmetova, Zh. E. Temirbekova
Published in: 2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)
Publication date: 2023-05-04
DOI: 10.1109/SIST58284.2023.10223542 (https://doi.org/10.1109/SIST58284.2023.10223542)

Abstract

With the popularization of the internet and the rapid development of science and technology, online shopping has become the norm in people's lives and the e-commerce industry is booming; in addition, this has led to an increase in logistics activity. In today's business competition, many companies strive to outperform enterprises of the same type by continually improving their information capabilities. This paper aims to solve problems such as the rapid growth of massive e-commerce logistics data and the isolation of data across business systems. The overall data warehouse is designed and built on a Hadoop cluster, and the data warehouse tool Hive is used to process the data. In the extraction stage of ETL, Sqoop and Flume are used to retrieve business data and log data; for the other stages of ETL, Scala and Java are used to process and filter the data and upload it to HDFS. The data warehouse is divided into layers and subject areas to simplify data management. Based on the design of the overall system and the data warehouse architecture, the conceptual, logical, and physical models of the data warehouse are developed, and the star model is selected as the dimensional model. Finally, the application and implementation of the data warehouse for e-commerce logistics are demonstrated. Such a data warehouse not only ensures that e-commerce companies receive logistics information in a timely manner, but also prompts decision makers to adjust logistics strategies promptly based on the data, which can in turn improve user satisfaction and experience and reduce costs.
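
To make the star model mentioned in the abstract more concrete, the following is a minimal sketch of one fact table joined to two dimension tables. It is not the authors' schema: the abstract does not list any tables, so every table name, column name, and sample row here is hypothetical, and an in-memory SQLite database stands in for Hive so the example runs without a Hadoop cluster.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by two dimension tables (a star)."""
    # All table and column names are hypothetical, chosen for illustration only.
    conn.executescript(
        """
        CREATE TABLE dim_carrier (carrier_id INTEGER PRIMARY KEY, carrier_name TEXT);
        CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region_name  TEXT);
        CREATE TABLE fact_shipment (
            shipment_id   INTEGER PRIMARY KEY,
            carrier_id    INTEGER REFERENCES dim_carrier(carrier_id),
            region_id     INTEGER REFERENCES dim_region(region_id),
            delivery_days INTEGER,
            shipping_cost REAL
        );
        """
    )


def load_sample_rows(conn):
    """Insert a few made-up rows; these are not data from the paper."""
    conn.executemany("INSERT INTO dim_carrier VALUES (?, ?)",
                     [(1, "CarrierA"), (2, "CarrierB")])
    conn.executemany("INSERT INTO dim_region VALUES (?, ?)",
                     [(1, "North"), (2, "South")])
    conn.executemany("INSERT INTO fact_shipment VALUES (?, ?, ?, ?, ?)", [
        (100, 1, 1, 2, 5.0),
        (101, 1, 2, 4, 7.0),
        (102, 2, 1, 3, 6.0),
        (103, 2, 1, 5, 8.0),
    ])


def avg_delivery_days_by_carrier(conn):
    """Join the fact table to a dimension and aggregate a measure."""
    return conn.execute(
        """
        SELECT c.carrier_name, AVG(f.delivery_days)
        FROM fact_shipment AS f
        JOIN dim_carrier AS c ON c.carrier_id = f.carrier_id
        GROUP BY c.carrier_name
        ORDER BY c.carrier_name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
result = avg_delivery_days_by_carrier(conn)
print(result)
```

The query shape is the point of the sketch: measures such as delivery time live in the central fact table, while descriptive attributes live in the dimensions and are reached through a single join each. In a Hive deployment the same join and aggregation would be written in HiveQL against tables stored on HDFS.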