Xiannan Huang, Juhua Hong, Shicheng Huang, Linyao Zhang, Lin Liu, Siyu Yang, Weiwei Lin
Research on data mining and short-term power load forecasting based on Hadoop
International Conference on Mathematics, Modeling and Computer Science, published 2023-06-02
DOI: https://doi.org/10.1117/12.2671337
Citation count: 0
Abstract
With the rapid development of smart grid technology, the volume of power data keeps growing, and traditional centralized processing can no longer meet the requirements of power system operation. To better meet the storage and analysis needs of power data, researchers have proposed distributed processing of power big data. After reviewing the state of research on data mining and load prediction algorithms, this paper examines the composite characteristics and patterns of change in the power AI competition data on a big data platform, and constructs a user clustering model with Mahout at its core and a multi-algorithm fusion prediction model with Spark at its core. The results show that combining Spark and Mahout, two frameworks in the Hadoop ecosystem, takes account of the strengths and weaknesses of each framework and effectively controls the time and cost of experimental analysis. Spark serves as the short-term composite forecasting platform, while Mahout serves as the data mining framework.
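The abstract does not say which algorithms are fused or how their outputs are combined. As a minimal sketch of one common way to build a multi-algorithm fusion forecast, the example below weights each model by the inverse of its validation error. The weighting scheme, the function names, and the toy load values are all assumptions made for illustration, and the code is plain Python rather than Spark.

```python
from statistics import mean


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def inverse_error_weights(actual, forecasts, eps=1e-9):
    """Weight each model by the inverse of its validation MAE.

    Hypothetical scheme for illustration; the paper's actual
    fusion rule is not given in the abstract.
    """
    inverse = [1.0 / (mae(actual, f) + eps) for f in forecasts]
    total = sum(inverse)
    return [w / total for w in inverse]


def fuse(forecasts, weights):
    """Weighted average of the model forecasts at each time step."""
    steps = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(steps)
    ]


# Toy validation data, made up for illustration (load in MW).
actual = [100.0, 120.0, 140.0, 130.0]
model_a = [98.0, 123.0, 138.0, 131.0]    # smaller errors
model_b = [110.0, 110.0, 150.0, 120.0]   # larger errors

weights = inverse_error_weights(actual, [model_a, model_b])
fused = fuse([model_a, model_b], weights)
```

Here the more accurate model receives the larger weight, so the fused forecast leans toward it while still using the second model. On a cluster, the same weighting could be applied per user group after clustering, which is one plausible reading of how the two models in the abstract fit together.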