STLLM-GAN: Spatio-temporal LLM Generative Adversarial Network for PM2.5 prediction

IF 7.5 · CAS Zone 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications · Pub Date: 2025-06-21 · DOI: 10.1016/j.eswa.2025.128250

Changkui Yin, Yingchi Mao, Liren Deng, Meng Chen, Yi Rong, Xiaoming He, Xiaofeng Zhou

Volume 292, Article 128250 · Journal Article · https://www.sciencedirect.com/science/article/pii/S095741742501869X

Citations: 0

Abstract

Accurate PM2.5 prediction plays an important role in climate change mitigation and environmental protection. As an emerging artificial intelligence technique, Large Language Models (LLMs) have exhibited powerful data processing and adaptive feature learning capabilities and have therefore been widely applied to Time Series Prediction (TSP). However, current LLM-based TSP models struggle to predict PM2.5 accurately for two reasons: 1) they neglect or fail to fully extract spatio-temporal dependencies, and 2) they rely solely on supervised training and so fail to learn the distribution of the real data. To address both issues, we present a novel PM2.5 prediction framework, the Spatio-Temporal LLM Generative Adversarial Network (STLLM-GAN). Specifically, to capture spatio-temporal dependencies, we first develop a Spatio-Temporal Large Language Model (STLLM) comprising a Spatio-Temporal Module (STM) and an LLM-enabled Inference Module (LLMIM). To optimize STLLM's training, we design an adversarial training scheme based on a Generative Adversarial Network (GAN). The scheme combines unsupervised and supervised training in a mutually beneficial manner: the unsupervised training seeks to learn the distribution of the real data, while the supervised training aligns the estimates with the actual values by minimizing the Mean Squared Error (MSE). We carry out extensive experiments on two real-world air pollutant concentration datasets, covering Shanghai and Beijing, respectively. The experimental results show that STLLM-GAN outperforms advanced benchmarks in prediction performance.
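The combination of supervised and unsupervised objectives described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual loss formulation, so everything here is an assumption: the binary cross-entropy form of the adversarial term, the weighting factor `lam`, and the function names are hypothetical and are not taken from the paper.

```python
import math

def mse(pred, target):
    """Mean squared error between predicted and observed values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for a single discriminator output in (0, 1)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))

def generator_loss(pred, target, d_fake, lam=0.1):
    """Supervised MSE plus an adversarial term.

    d_fake is the discriminator's probability that the predicted sequence
    is real; lam is a hypothetical weight, not a value from the paper.
    """
    return mse(pred, target) + lam * bce(d_fake, 1.0)

def discriminator_loss(d_real, d_fake):
    """Standard GAN discriminator loss: real labelled 1, generated labelled 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

# Toy PM2.5 values (made up for illustration only).
pred, target = [35.0, 40.0], [30.0, 42.0]
g_loss = generator_loss(pred, target, d_fake=0.5)
d_loss = discriminator_loss(d_real=0.5, d_fake=0.5)
```

In this sketch the MSE term pulls the predictions toward the observed values, while the adversarial term rewards predictions that the discriminator cannot distinguish from real sequences, which is one common way to combine the two kinds of training.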


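Source journal

Expert Systems with Applications (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months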

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.