Xiannan Huang, Juhua Hong, Shicheng Huang, Linyao Zhang, Lin Liu, Siyu Yang, Weiwei Lin
Published in: International Conference on Mathematics, Modeling and Computer Science, 2023-06-02. DOI: https://doi.org/10.1117/12.2671337
Research on data mining and short-term power load forecasting based on Hadoop
With the rapid development of smart grid technology, the volume of power data keeps growing, and traditional centralized processing can no longer meet the requirements of power system operation. To better meet the storage and analysis needs of power data, researchers have proposed distributed big data platforms for power data. After reviewing the state of research on data mining and load forecasting algorithms, this paper uses a big data platform to study the composite characteristics and variation patterns of data from a power AI competition, and builds a user clustering model with Mahout at its core and a multi-algorithm fusion forecasting model with Spark at its core. The results show that combining Spark and Mahout, two frameworks in the Hadoop ecosystem, takes account of the strengths and weaknesses of each framework and effectively controls the time and cost of experimental analysis. Spark serves as the short-term composite forecasting platform, while Mahout serves as the data mining framework.
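The abstract does not say how the individual forecasts are combined in the multi-algorithm fusion model. As a rough illustration of the general idea only, the sketch below fuses the outputs of several forecasters with weights inversely proportional to each model's validation error. This weighting scheme, the function names, and the sample numbers are all assumptions made for illustration; they are not taken from the paper, and the sketch uses plain Python rather than Spark.

```python
from typing import Dict, List


def mape(actual: List[float], predicted: List[float]) -> float:
    """Mean absolute percentage error as a fraction (0.05 means 5%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def inverse_error_weights(errors: Dict[str, float]) -> Dict[str, float]:
    """Assumed scheme: weight each model in proportion to 1 / validation error."""
    inverse = {name: 1.0 / max(err, 1e-9) for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def fuse(forecasts: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted average of the per-model forecasts at each time step."""
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * forecasts[name][t] for name in forecasts)
        for t in range(horizon)
    ]


if __name__ == "__main__":
    # Hypothetical validation data: observed load and two models' predictions.
    observed = [100.0, 110.0, 120.0, 130.0]
    validation = {
        "model_a": [98.0, 112.0, 118.0, 133.0],
        "model_b": [105.0, 104.0, 126.0, 124.0],
    }
    errors = {name: mape(observed, pred) for name, pred in validation.items()}
    weights = inverse_error_weights(errors)

    # Hypothetical forecasts for the next three time steps.
    upcoming = {
        "model_a": [135.0, 128.0, 119.0],
        "model_b": [131.0, 125.0, 122.0],
    }
    print(weights)
    print(fuse(upcoming, weights))
```

A model with a lower validation error receives a larger weight, so the fused forecast leans toward whichever algorithm has recently been more accurate. In a Spark setting the same weighting could be applied per user cluster, but that pairing is likewise an assumption here.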