Letian Gong;Shengnan Guo;Yan Lin;Yichen Liu;Erwen Zheng;Yiwei Shuang;Youfang Lin;Jilin Hu;Huaiyu Wan
{"title":"STCDM: Spatio-Temporal Contrastive Diffusion Model for Check-In Sequence Generation","authors":"Letian Gong;Shengnan Guo;Yan Lin;Yichen Liu;Erwen Zheng;Yiwei Shuang;Youfang Lin;Jilin Hu;Huaiyu Wan","doi":"10.1109/TKDE.2025.3525718","DOIUrl":null,"url":null,"abstract":"Analyzing and comprehending check-in sequences is crucial for various applications in smart cities. However, publicly available check-in datasets are often limited in scale due to privacy concerns. This poses a significant obstacle to academic research and downstream applications. Thus, it is urgent to generate realistic check-in datasets. The denoising diffusion probabilistic model (DDPM) as one of the most capable generation methods is a good choice to achieve this goal. However, generating check-in sequences using DDPM is not an easy feat. The difficulties lie in handling check-in sequences of variable lengths and capturing the correlation from check-in sequences’ distinct characteristics. This paper addresses the challenges by proposing a Spatio-Temporal Contrastive Diffusion Model (STCDM). This model introduces a novel spatio-temporal lossless encoding method that effectively encodes check-in sequences into a suitable format with equal length. Furthermore, we capture the spatio-temporal correlations with two disentangled diffusion modules to reduce the impact of the difference between spatial and temporal characteristics. Finally, we incorporate contrastive learning to enhance the relationship between diffusion modules. We generate four realistic datasets in different scenarios using STCDM and design four metrics for comparison. Experiments demonstrate that our generated datasets are more realistic and free of privacy leakage.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 4","pages":"2141-2154"},"PeriodicalIF":8.9000,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Knowledge and Data Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10836764/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Analyzing and comprehending check-in sequences is crucial for various applications in smart cities. However, publicly available check-in datasets are often limited in scale due to privacy concerns. This poses a significant obstacle to academic research and downstream applications. Thus, it is urgent to generate realistic check-in datasets. The denoising diffusion probabilistic model (DDPM) as one of the most capable generation methods is a good choice to achieve this goal. However, generating check-in sequences using DDPM is not an easy feat. The difficulties lie in handling check-in sequences of variable lengths and capturing the correlation from check-in sequences’ distinct characteristics. This paper addresses the challenges by proposing a Spatio-Temporal Contrastive Diffusion Model (STCDM). This model introduces a novel spatio-temporal lossless encoding method that effectively encodes check-in sequences into a suitable format with equal length. Furthermore, we capture the spatio-temporal correlations with two disentangled diffusion modules to reduce the impact of the difference between spatial and temporal characteristics. Finally, we incorporate contrastive learning to enhance the relationship between diffusion modules. We generate four realistic datasets in different scenarios using STCDM and design four metrics for comparison. Experiments demonstrate that our generated datasets are more realistic and free of privacy leakage.
期刊介绍:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.