A Novel Heat Load Forecasting Method Based on CASDA-Driven Unsupervised Clustering

IF 6.7 · Zone 2, Engineering Technology · Q1 CONSTRUCTION & BUILDING TECHNOLOGY

Journal of Building Engineering · Pub Date: 2025-07-11 · DOI: 10.1016/j.jobe.2025.113457

Qi Liu, Hongjuan Hou, Zekai Zhou, Lengge Si, Xi Wang, Yan Jia, Eric Hu


Citations: 0

Abstract



A Novel Heat Load Forecasting Method Based on CASDA-Driven Unsupervised Clustering

Accurate heat load forecasting is essential for improving the operational efficiency and intelligent management of district heating systems, particularly for addressing the mismatch between heat supply and demand caused by spatiotemporal variability. While Artificial Neural Networks (ANNs) have shown promise in this domain, their performance is highly sensitive to the quality and volume of training data. Given limited data availability and the cost of large-scale data acquisition, enhancing data quality through preprocessing has become a practical alternative. This study proposes an improved data preprocessing strategy, termed Cluster Analysis based on Similar Day Approach (CASDA), to enhance ANN training for heat load forecasting. Unlike the traditional Similar Day Approach (SDA), which relies primarily on weather similarity, CASDA clusters historical data by heat load pattern, providing a more representative training dataset. Clusters are labeled by their dominant weather types and used to train distinct ANN models. For forecasting, the most appropriate model is selected by evaluating the similarity between the forecast day and each cluster using a combination of Grey Relational Analysis (GRA) and Pearson correlation. A case study on a district heating substation in Beijing demonstrates that CASDA significantly improves forecasting accuracy across multiple ANN architectures, including Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and Transformers, and reduces MAPE by 15.2% for the ConvGRU-GRU model compared with traditional SDA methods. Notably, the Transformer-based model (CMT) achieved the best performance, with an average validation R² of 0.77, outperforming both traditional SDA models and models trained on unprocessed data. Moreover, CASDA reduces modeling costs by optimizing data utilization.
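The model-selection step described in the abstract (scoring the forecast day against each cluster with GRA and Pearson correlation) could be sketched roughly as follows. This is an illustration only, not the authors' implementation: the abstract does not say which features are compared or how the two measures are combined, so the normalized feature vectors, the equal-weight average, the distinguishing coefficient `rho = 0.5`, and the names `grey_relational_grades`, `pearson`, and `select_cluster` are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant sequence has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx and sy else 0.0


def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate against the reference.

    Uses the common Deng formulation; rho is the distinguishing
    coefficient (0.5 is a conventional default, assumed here).
    """
    diffs = [[abs(r - c) for r, c in zip(reference, cand)]
             for cand in candidates]
    flat = [d for row in diffs for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every candidate equals the reference
        return [1.0] * len(candidates)
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in diffs
    ]


def select_cluster(forecast_features, cluster_features, weight=0.5):
    """Return the best-matching cluster name and all combined scores.

    forecast_features: normalized feature vector for the forecast day
        (hypothetical, e.g. scaled weather variables).
    cluster_features: dict mapping cluster name to a representative
        vector of the same length (hypothetical).
    weight: share given to the GRA grade; the rest goes to Pearson's r
        rescaled from [-1, 1] to [0, 1]. This combination rule is assumed.
    """
    names = list(cluster_features)
    candidates = [cluster_features[name] for name in names]
    grades = grey_relational_grades(forecast_features, candidates)
    scores = {}
    for name, cand, grade in zip(names, candidates, grades):
        r = pearson(forecast_features, cand)
        scores[name] = weight * grade + (1 - weight) * (r + 1) / 2
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    forecast_day = [0.1, 0.2, 0.4, 0.3]
    clusters = {
        "cold": [0.1, 0.2, 0.4, 0.3],
        "mild": [0.9, 0.8, 0.6, 0.7],
    }
    best, scores = select_cluster(forecast_day, clusters)
    print(best, scores)
```

In a pipeline of the kind the abstract outlines, the returned cluster name would then pick which of the separately trained ANN models produces the forecast.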


Source journal

Journal of Building Engineering (Engineering-Civil and Structural Engineering)

CiteScore: 10.00

Self-citation rate: 12.50%

Articles published: 1901

Review time: 35 days

About the journal: The Journal of Building Engineering is an interdisciplinary journal that covers all aspects of science and technology concerned with the whole life cycle of the built environment, from the design phase through construction, operation, performance, maintenance, and deterioration.