{"title":"Discovering Temporal Patterns for Event Sequence Clustering via Policy Mixture Model (Extended Abstract)","authors":"Weichang Wu, Junchi Yan, Xiaokang Yang, H. Zha","doi":"10.1109/ICDE55515.2023.00308","DOIUrl":null,"url":null,"abstract":"We focus on the problem of event sequence clustering with different temporal patterns from the view of Reinforcement Learning (RL), whereby the observed sequences are assumed to be generated from a mixture of latent policies. We propose an Expectation-Maximization (EM) based algorithm to cluster the sequences with different temporal patterns into the underlying policies while simultaneously learning each of the policy model, in E-step estimating the cluster labels for each sequence, in M-step learning the respective policy. For each policy learning, we resort to Inverse Reinforcement Learning (IRL) by decomposing the observed sequence into states (hidden embedding of event history) and actions (time interval to next event) in order to learn a reward function. Experiments on synthetic and real-world datasets show the efficacy of our method against the state-of-the-arts.","PeriodicalId":434744,"journal":{"name":"2023 IEEE 39th International Conference on Data Engineering (ICDE)","volume":"158 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE 39th International Conference on Data Engineering (ICDE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE55515.2023.00308","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
We focus on the problem of clustering event sequences with different temporal patterns from the view of Reinforcement Learning (RL), whereby the observed sequences are assumed to be generated from a mixture of latent policies. We propose an Expectation-Maximization (EM) based algorithm that clusters sequences with different temporal patterns into the underlying policies while simultaneously learning each policy model: the E-step estimates the cluster label for each sequence, and the M-step learns the corresponding policy. To learn each policy, we resort to Inverse Reinforcement Learning (IRL), decomposing the observed sequence into states (hidden embeddings of the event history) and actions (time intervals to the next event) in order to learn a reward function. Experiments on synthetic and real-world datasets show the efficacy of our method against state-of-the-art baselines.
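To make the EM structure concrete, below is a minimal, self-contained sketch of the alternating E-step/M-step over a policy mixture. It is an illustrative toy under stated assumptions, not the paper's implementation: each latent "policy" is stood in for by a simple exponential model over inter-event intervals, whereas the paper learns a reward function via IRL over hidden history embeddings. The function name `em_cluster` and all parameters are hypothetical.

```python
# Minimal EM sketch for a policy-mixture view of event-sequence clustering.
# Assumption: each latent "policy" is a toy exponential model over inter-event
# times (actions); the paper instead learns a reward/policy via IRL on
# hidden embeddings of the event history (states).
import numpy as np

def em_cluster(sequences, n_policies, n_iters=50, seed=0):
    """sequences: list of 1-D arrays of inter-event times (one array per sequence)."""
    rng = np.random.default_rng(seed)
    rates = rng.uniform(0.5, 2.0, size=n_policies)   # toy per-policy parameters
    mix = np.full(n_policies, 1.0 / n_policies)       # mixture weights

    for _ in range(n_iters):
        # E-step: posterior responsibility of each policy for each sequence.
        log_r = np.zeros((len(sequences), n_policies))
        for i, seq in enumerate(sequences):
            for k in range(n_policies):
                # log-likelihood of the whole sequence under policy k
                log_r[i, k] = np.log(mix[k]) + np.sum(np.log(rates[k]) - rates[k] * seq)
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-fit each policy on its softly assigned sequences
        # (stands in for the paper's IRL-based policy learning step).
        for k in range(n_policies):
            w = resp[:, k]
            total_events = sum(w[i] * len(s) for i, s in enumerate(sequences))
            total_time = sum(w[i] * s.sum() for i, s in enumerate(sequences))
            rates[k] = total_events / max(total_time, 1e-8)
        mix = resp.mean(axis=0)

    labels = resp.argmax(axis=1)   # hard cluster label per sequence
    return labels, rates, mix

# Usage example: two synthetic clusters with fast vs. slow event rates.
rng = np.random.default_rng(1)
seqs = [rng.exponential(1 / 3.0, 30) for _ in range(20)] + \
       [rng.exponential(1 / 0.5, 30) for _ in range(20)]
labels, rates, mix = em_cluster(seqs, n_policies=2)
print(labels, rates, mix)
```

The key design point mirrored from the abstract is the alternation: cluster responsibilities are estimated with the current policies fixed (E-step), then each policy is re-learned from the sequences softly assigned to it (M-step).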