Causal Intervention Is What Large Language Models Need for Spatio-Temporal Forecasting

IF 10.5 | Zone 1, Computer Science | JCR Q1, AUTOMATION & CONTROL SYSTEMS
Shijie Li;He Li;Xiaojing Li;Yong Xu;Zhenhong Lin;Huaiguang Jiang
DOI: 10.1109/TCYB.2025.3569333
Journal: IEEE Transactions on Cybernetics, vol. 55, no. 8, pp. 3825-3837
Publication date: 2025-03-29
Publication type: Journal Article
Open access: No
Link: https://ieeexplore.ieee.org/document/11017752/
Citations: 0

Abstract

Spatio-temporal forecasting plays a crucial role in the dynamic perception of smart cities, such as traffic flow prediction, renewable energy forecasting, and load prediction. Its objective is to understand the patterns of spatio-temporal changes under the interaction of various factors. Accurate spatio-temporal forecasting relies on sufficient high-quality data and powerful models. However, in reality, data is often sparse. In such cases, while adaptive graphs and large language models (LLMs) can maintain performance, they face issues of spatial spurious associations and hallucinations, respectively. These issues hinder the ability of the model to learn and infer cross spatio-temporal and cross-scale features effectively. To address this, we propose a novel model termed spatio-temporal causal intervention large language model (STCInterLLM). This model employs a newly designed causal intervention encoder to update spatial spurious correlations in the spatio-temporal adaptive graph. Subsequently, the novel chain-of-action prompting text is utilized to enforce the decomposition of the prediction process, thereby enhancing the causal representation of features while mitigating hallucinations in LLMs. Finally, a lightweight marker alignment module ensures the consistency between the encoder, prompting text, and LLM, enabling accurate forecasting of distinct scale spatio-temporal evolution patterns. Extensive experiments conducted on power distribution systems integrated with renewable energy sources and transportation systems encompassing diverse types of data, demonstrate that the proposed STCInterLLM consistently achieves state-of-the-art performance across significantly varied scenarios. Codes are available at https://github.com/lishijie15/STCInterLLM.
Source journal
IEEE Transactions on Cybernetics
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, CYBERNETICS
CiteScore: 25.40
Self-citation rate: 11.00%
Articles published: 1869
Journal description: The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the transactions welcomes papers on communication and control across machines or machine, human, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.