Yicheng Pan, Yifan Zhang, Xinrui Jiang, Meng Ma, Ping Wang
EffCause: Discover Dynamic Causal Relationships Efficiently from Time-Series
ACM Transactions on Knowledge Discovery from Data, published 2024-01-16. DOI: https://doi.org/10.1145/3640818
Abstract
Since Granger causality was proposed, many researchers have built on the idea and developed extensions to the original algorithm. The classic Granger causality test aims to detect whether a static causal relationship exists. Notably, a fundamental assumption underlying most previous studies is the stationarity of causality, which requires the causal relationships between variables to remain stable. However, this study argues that this assumption is easily violated in real-world scenarios. Our paper presents an essential observation: if we consider a sufficiently short window when discovering rapidly changing causalities, they remain approximately static within that window and can therefore be detected correctly with static methods. In light of this, we develop EffCause, which brings dynamics into classic Granger causality. Specifically, to efficiently examine causalities over different sliding window lengths, we design two optimization schemes in EffCause and demonstrate its advantage through extensive experiments on both simulated and real-world datasets. The results validate that EffCause achieves state-of-the-art accuracy in continuous causal discovery tasks while also computing faster. Case studies from cloud system failure analysis and traffic flow monitoring show that EffCause effectively helps us understand real-world time-series data and solve practical problems.
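The short-window observation in the abstract can be illustrated with a minimal sketch. This is not the EffCause algorithm and does not include its two optimization schemes, which the abstract does not describe. It is a naive baseline that reruns a classic lag-1 Granger F-test from scratch in each window. All function names (`lagged_design`, `granger_f_stat`, `sliding_granger`, `simulate`), the synthetic data, and the window, step, and coupling values are assumptions made for illustration.

```python
import numpy as np


def lagged_design(series, lag):
    """Columns series[t-1], ..., series[t-lag] for t = lag .. len(series)-1."""
    n = len(series)
    return np.column_stack([series[lag - k : n - k] for k in range(1, lag + 1)])


def residual_sum_of_squares(design, target):
    """Ordinary least squares fit; returns the sum of squared residuals."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coef
    return float(residual @ residual)


def granger_f_stat(x, y, lag=1):
    """F-statistic for the hypothesis 'x Granger-causes y' at a fixed lag order."""
    target = y[lag:]
    intercept = np.ones((len(target), 1))
    # Restricted model: y explained by its own past only.
    restricted = np.hstack([intercept, lagged_design(y, lag)])
    # Full model: y explained by its own past plus the past of x.
    full = np.hstack([restricted, lagged_design(x, lag)])
    rss_restricted = residual_sum_of_squares(restricted, target)
    rss_full = residual_sum_of_squares(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)


def sliding_granger(x, y, window, step, lag=1):
    """Run the static test independently in each window (no reuse of work)."""
    starts = range(0, len(x) - window + 1, step)
    return [(s, granger_f_stat(x[s:s + window], y[s:s + window], lag)) for s in starts]


def simulate(n=400, switch=200, coupling=1.5, seed=0):
    """Hypothetical data: x drives y with lag 1 only before index `switch`."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = rng.normal(size=n)  # pure noise everywhere to start with
    y[1:switch] += coupling * x[: switch - 1]  # y[t] += coupling * x[t-1] for t < switch
    return x, y


x, y = simulate()
for start, f_stat in sliding_granger(x, y, window=100, step=100):
    print(f"window [{start}, {start + 100}): F = {f_stat:.1f}")
```

In this toy setup the relationship is static inside each window, so the F-statistic is large in windows that lie before the switch and small afterwards. A single test over the full series would blend the two regimes. Recomputing every window independently, as done here, is the kind of redundant work an efficient method would presumably try to avoid.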
About the journal:
TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multi-media data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.