Research on data mining and short-term power load forecasting based on Hadoop

International Conference on Mathematics, Modeling and Computer Science
Pub Date: 2023-06-02
DOI: 10.1117/12.2671337

Xiannan Huang, Juhua Hong, Shicheng Huang, Linyao Zhang, Lin Liu, Siyu Yang, Weiwei Lin


Citations: 0

Abstract

With the rapid development of smart grid technology, the volume of power data keeps growing, and traditional centralized processing can no longer meet the requirements of power system operation. To better meet the storage and analysis needs of power data, researchers have proposed distributed processing of power big data. Building on a review of current research on data mining and load forecasting algorithms, this paper uses a big data platform to examine the composite characteristics and variation patterns of data from a power AI competition, and constructs a user clustering model built on Mahout and a multi-algorithm fusion forecasting model built on Spark. The results show that combining Spark and Mahout, two frameworks in the Hadoop ecosystem, takes account of the strengths and weaknesses of each framework and effectively controls the time and cost of experimental analysis. Spark serves as the short-term composite forecasting platform, while Mahout serves as the data mining framework.
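The abstract does not say how the multi-algorithm fusion combines its component forecasts. As a rough illustration of the general idea only, the sketch below fuses two forecasts by weighting each model by the inverse of its validation error. The weighting scheme, the model names `model_a` and `model_b`, and all load values are assumptions made for this example; it is plain Python and does not use Spark or Mahout.

```python
from typing import Dict, List


def mape(actual: List[float], predicted: List[float]) -> float:
    """Mean absolute percentage error, as a fraction (0.05 means 5%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def inverse_error_weights(errors: Dict[str, float]) -> Dict[str, float]:
    """Weight each model by 1/error, normalised so the weights sum to 1."""
    inverse = {name: 1.0 / max(err, 1e-9) for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def fuse(forecasts: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Combine per-model forecasts into one series using the given weights."""
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * forecasts[name][t] for name in forecasts)
        for t in range(horizon)
    ]


# Hypothetical validation data: actual hourly load (MW) and two model outputs.
actual = [100.0, 120.0, 110.0, 130.0]
validation = {
    "model_a": [90.0, 108.0, 99.0, 117.0],    # consistently 10% low
    "model_b": [105.0, 126.0, 115.5, 136.5],  # consistently 5% high
}

# Score each model on the validation window, then derive fusion weights.
errors = {name: mape(actual, preds) for name, preds in validation.items()}
weights = inverse_error_weights(errors)

# Hypothetical next-period forecasts from the same two models.
next_forecasts = {"model_a": [95.0, 112.0], "model_b": [104.0, 125.0]}
fused = fuse(next_forecasts, weights)
```

With these made-up numbers the more accurate model receives roughly twice the weight of the other, so the fused forecast sits closer to its output. Other fusion rules, such as stacking or equal averaging, would fit the abstract's description just as well.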

