Causal Intervention Is What Large Language Models Need for Spatio-Temporal Forecasting
Authors: Shijie Li; He Li; Xiaojing Li; Yong Xu; Zhenhong Lin; Huaiguang Jiang
Journal: IEEE Transactions on Cybernetics, vol. 55, no. 8, pp. 3825-3837 (JCR Q1, Automation & Control Systems; impact factor 10.5)
Published: 2025-03-29
DOI: 10.1109/TCYB.2025.3569333
URL: https://ieeexplore.ieee.org/document/11017752/
Spatio-temporal forecasting plays a crucial role in the dynamic perception of smart cities, such as traffic flow prediction, renewable energy forecasting, and load prediction. Its objective is to understand the patterns of spatio-temporal changes under the interaction of various factors. Accurate spatio-temporal forecasting relies on sufficient high-quality data and powerful models. However, in reality, data is often sparse. In such cases, while adaptive graphs and large language models (LLMs) can maintain performance, they face issues of spatial spurious associations and hallucinations, respectively. These issues hinder the ability of the model to learn and infer cross-spatio-temporal and cross-scale features effectively. To address this, we propose a novel model termed the spatio-temporal causal intervention large language model (STCInterLLM). This model employs a newly designed causal intervention encoder to update spatial spurious correlations in the spatio-temporal adaptive graph. Subsequently, the novel chain-of-action prompting text is utilized to enforce the decomposition of the prediction process, thereby enhancing the causal representation of features while mitigating hallucinations in LLMs. Finally, a lightweight marker alignment module ensures consistency between the encoder, prompting text, and LLM, enabling accurate forecasting of distinct-scale spatio-temporal evolution patterns. Extensive experiments conducted on power distribution systems integrated with renewable energy sources and on transportation systems encompassing diverse types of data demonstrate that the proposed STCInterLLM consistently achieves state-of-the-art performance across significantly varied scenarios. Code is available at https://github.com/lishijie15/STCInterLLM.
About the journal:
The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the Transactions welcomes papers on communication and control across machines or among machines, humans, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.