Sequential Causal Effect Estimation by Jointly Modeling the Unmeasured Confounders and Instrumental Variables

IF 8.9 | Zone 2, Computer Science | JCR Q1, Computer Science, Artificial Intelligence

IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-12-04 DOI:10.1109/TKDE.2024.3510734

Zexu Sun;Bowei He;Shiqi Shen;Zhipeng Wang;Zhi Gong;Chen Ma;Qi Qi;Xu Chen

Volume 37, Issue 2, pages 910-922. Full text: https://ieeexplore.ieee.org/document/10777296/

Citations: 0

Abstract

Sequential causal effect estimation has recently attracted increasing attention from both research and industry. Although existing models have achieved many successes, they still have important limitations. They usually assume that the causal graph is sufficient, i.e., that there are no latent factors such as unmeasured confounders and instrumental variables. In real-world scenarios, however, it is hard to record every factor in the observational data, so the causal sufficiency assumption does not hold. Moreover, existing models mainly focus on discrete treatments rather than continuous ones. To alleviate these problems, we propose a novel Continuous Causal Model that explicitly captures the Latent Factors (C$^{2}$M-LF for short). First, we define a sequential causal graph that simultaneously considers the unmeasured confounders and instrumental variables. Second, we describe, from a mutual information perspective, the independence that should hold among the different variables, and further propose our learning objective. Third, we reweight samples in the continuous treatment space so that the model is optimized without bias. Beyond these designs, we theoretically analyze the model's causal identifiability and unbiasedness. Finally, we conduct extensive experiments on both simulated and real-world datasets to demonstrate the effectiveness of the proposed model.
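The abstract does not spell out the reweighting scheme. As a rough illustration of what reweighting samples in a continuous treatment space can look like, the sketch below computes stabilized inverse probability weights from a Gaussian generalized propensity score. This is a standard technique swapped in for illustration and is not claimed to be the C$^{2}$M-LF procedure; the function names, the linear-Gaussian treatment model, and the synthetic data are all assumptions.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def stabilized_weights(covariates, treatment):
    """Stabilized weights w_i = f(T_i) / f(T_i | X_i) for a continuous treatment.

    Assumption (illustrative only): T | X is Gaussian with a mean that is
    linear in X, and the marginal of T is approximated by a Gaussian.
    """
    n = covariates.shape[0]
    design = np.column_stack([np.ones(n), covariates])  # add intercept column
    coef, *_ = np.linalg.lstsq(design, treatment, rcond=None)
    fitted = design @ coef
    resid_std = np.std(treatment - fitted)

    conditional = gaussian_pdf(treatment, fitted, resid_std)  # f(T | X)
    marginal = gaussian_pdf(treatment, treatment.mean(), treatment.std())  # f(T)
    return marginal / conditional


# Hypothetical synthetic data: the treatment depends on two observed covariates.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
t = 0.8 * x[:, 0] - 0.5 * x[:, 1] + rng.normal(scale=1.0, size=500)
w = stabilized_weights(x, t)
```

In a weighted training loop, `w` would multiply each sample's loss term. Weights of this kind can be heavy-tailed, so clipping or normalizing them is a common practical step.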




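Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 11.70

Self-citation rate: 3.40%

Articles published: 515

Review time: 6 months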

Journal description: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.