Empowering Predictive Modeling by GAN-based Causal Information Learning

IF 7.2 | Zone 4 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Jinwei Zeng, Guozhen Zhang, Jian Yuan, Yong Li, Depeng Jin
ACM Transactions on Intelligent Systems and Technology | Published 2024-03-20 | DOI: https://doi.org/10.1145/3652610
Citations: 0

Abstract

In ubiquitous computing prediction tasks such as human activity prediction, mobility prediction, and health prediction, many causal relationships are easy to specify. However, most existing methods in these fields fail to take advantage of this prior causal knowledge. They typically make predictions based only on correlations in the data, which hurts prediction performance in real-world scenarios because a distribution shift between training data and testing data generally exists. To fill this gap, we propose a GAN-based Causal Information Learning prediction framework (GCIL), which leverages causal information to improve the prediction performance of existing ubiquitous computing deep learning models. Specifically, the treatment variable, that is, the intervention that influences the target in a causal relationship, is generally continuous in ubiquitous computing, which poses a unique challenge. To address it, the framework employs a representation learning approach with a GAN-based deep learning model. By projecting all variables except the treatment into a latent space, it minimizes confounding bias and leverages the learned latent representation for accurate predictions. In this way, it handles the continuous treatment challenge while remaining easy to integrate with existing deep learning models, lifting their prediction performance in practical scenarios with causal information. Extensive experiments on two large-scale real-world datasets demonstrate its superior performance over multiple state-of-the-art baselines. We also propose an analytical framework and, together with extensive experiments, empirically show that our framework achieves larger performance gains under two conditions: when the distribution differences between the training data and the testing data are more significant, and when the treatment effects are larger. Overall, this work suggests that learning causal information is a promising way to improve the prediction performance of ubiquitous computing tasks. We open both our dataset and code and call for more research attention in this area.
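
The abstract's point that correlation-only predictors degrade under distribution shift can be illustrated with a toy simulation. The sketch below is not GCIL and contains no GAN or latent representation. It is a linear example assumed purely for illustration, with a hypothetical confounder c, a continuous treatment t, an outcome y, and made-up coefficients (true treatment effect 1.0, confounder effect 2.0). In the training data c drives t; in the shifted test data it does not.

```python
import random


def make_data(n, rng, confounded):
    """Return (t, c, y) rows; the true effect of treatment t on outcome y is 1.0."""
    rows = []
    for _ in range(n):
        c = rng.gauss(0.0, 1.0)                        # hypothetical confounder
        driver = c if confounded else rng.gauss(0.0, 1.0)
        t = driver + rng.gauss(0.0, 1.0)               # continuous treatment
        y = 1.0 * t + 2.0 * c + rng.gauss(0.0, 1.0)    # outcome
        rows.append((t, c, y))
    return rows


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def fit_naive(rows):
    """Least-squares fit of y on t alone (correlation only, confounder ignored)."""
    t = [r[0] for r in rows]
    y = [r[2] for r in rows]
    slope = cov(t, y) / cov(t, t)
    return mean(y) - slope * mean(t), slope


def fit_adjusted(rows):
    """Least-squares fit of y on t and c, solved with the 2x2 normal equations."""
    t = [r[0] for r in rows]
    c = [r[1] for r in rows]
    y = [r[2] for r in rows]
    stt, scc, stc = cov(t, t), cov(c, c), cov(t, c)
    sty, scy = cov(t, y), cov(c, y)
    det = stt * scc - stc * stc
    b_t = (sty * scc - scy * stc) / det
    b_c = (scy * stt - sty * stc) / det
    return mean(y) - b_t * mean(t) - b_c * mean(c), b_t, b_c


def mse(rows, predict):
    return mean([(predict(t, c) - y) ** 2 for t, c, y in rows])


rng = random.Random(0)
train = make_data(20000, rng, confounded=True)    # c drives t
test = make_data(20000, rng, confounded=False)    # shift: t no longer depends on c

a0, naive_slope = fit_naive(train)
a1, adj_slope, conf_slope = fit_adjusted(train)

mse_naive = mse(test, lambda t, c: a0 + naive_slope * t)
mse_adjusted = mse(test, lambda t, c: a1 + adj_slope * t + conf_slope * c)

print(f"naive slope:    {naive_slope:.2f} (true effect is 1.0)")
print(f"adjusted slope: {adj_slope:.2f}")
print(f"test MSE naive vs adjusted: {mse_naive:.2f} vs {mse_adjusted:.2f}")
```

In this setup the naive slope absorbs the confounder's influence and lands near 2 rather than the true 1, so its error grows once the link between c and t changes. The adjusted fit recovers an effect near 1 and transfers to the shifted data. GCIL, as described in the abstract, pursues the same goal with a learned latent representation rather than an explicit linear adjustment.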

Source journal
ACM Transactions on Intelligent Systems and Technology
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, INFORMATION SYSTEMS
CiteScore: 9.30
Self-citation rate: 2.00%
Articles published: 131
Journal description: ACM Transactions on Intelligent Systems and Technology is a scholarly journal that publishes the highest quality papers on intelligent systems, applicable algorithms and technology with a multi-disciplinary perspective. An intelligent system is one that uses artificial intelligence (AI) techniques to offer important services (e.g., as a component of a larger system) to allow integrated systems to perceive, reason, learn, and act intelligently in the real world. ACM TIST is published quarterly (six issues a year). Each issue has 8-11 regular papers, with around 20 published journal pages or 10,000 words per paper. Additional references, proofs, graphs or detailed experiment results can be submitted as a separate appendix, while excessively lengthy papers will be rejected automatically. Authors can include online-only appendices for additional content of their published papers and are encouraged to share their code and/or data with other readers.