STCDM: Spatio-Temporal Contrastive Diffusion Model for Check-In Sequence Generation

IF 8.9 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-01-10 DOI:10.1109/TKDE.2025.3525718

Letian Gong;Shengnan Guo;Yan Lin;Yichen Liu;Erwen Zheng;Yiwei Shuang;Youfang Lin;Jilin Hu;Huaiyu Wan

{"title":"STCDM: Spatio-Temporal Contrastive Diffusion Model for Check-In Sequence Generation","authors":"Letian Gong;Shengnan Guo;Yan Lin;Yichen Liu;Erwen Zheng;Yiwei Shuang;Youfang Lin;Jilin Hu;Huaiyu Wan","doi":"10.1109/TKDE.2025.3525718","DOIUrl":null,"url":null,"abstract":"Analyzing and comprehending check-in sequences is crucial for various applications in smart cities. However, publicly available check-in datasets are often limited in scale due to privacy concerns. This poses a significant obstacle to academic research and downstream applications. Thus, it is urgent to generate realistic check-in datasets. The denoising diffusion probabilistic model (DDPM) as one of the most capable generation methods is a good choice to achieve this goal. However, generating check-in sequences using DDPM is not an easy feat. The difficulties lie in handling check-in sequences of variable lengths and capturing the correlation from check-in sequences’ distinct characteristics. This paper addresses the challenges by proposing a Spatio-Temporal Contrastive Diffusion Model (STCDM). This model introduces a novel spatio-temporal lossless encoding method that effectively encodes check-in sequences into a suitable format with equal length. Furthermore, we capture the spatio-temporal correlations with two disentangled diffusion modules to reduce the impact of the difference between spatial and temporal characteristics. Finally, we incorporate contrastive learning to enhance the relationship between diffusion modules. We generate four realistic datasets in different scenarios using STCDM and design four metrics for comparison. Experiments demonstrate that our generated datasets are more realistic and free of privacy leakage.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 4","pages":"2141-2154"},"PeriodicalIF":8.9000,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Knowledge and Data Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10836764/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Analyzing and comprehending check-in sequences is crucial for various applications in smart cities. However, publicly available check-in datasets are often limited in scale due to privacy concerns. This poses a significant obstacle to academic research and downstream applications. Thus, it is urgent to generate realistic check-in datasets. The denoising diffusion probabilistic model (DDPM) as one of the most capable generation methods is a good choice to achieve this goal. However, generating check-in sequences using DDPM is not an easy feat. The difficulties lie in handling check-in sequences of variable lengths and capturing the correlation from check-in sequences’ distinct characteristics. This paper addresses the challenges by proposing a Spatio-Temporal Contrastive Diffusion Model (STCDM). This model introduces a novel spatio-temporal lossless encoding method that effectively encodes check-in sequences into a suitable format with equal length. Furthermore, we capture the spatio-temporal correlations with two disentangled diffusion modules to reduce the impact of the difference between spatial and temporal characteristics. Finally, we incorporate contrastive learning to enhance the relationship between diffusion modules. We generate four realistic datasets in different scenarios using STCDM and design four metrics for comparison. Experiments demonstrate that our generated datasets are more realistic and free of privacy leakage.

查看原文本刊更多论文

STCDM：用于生成签到序列的时空对比扩散模型

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEEE Transactions on Knowledge and Data Engineering 工程技术-工程：电子与电气

CiteScore

11.70

自引率

3.40%

发文量

515

审稿时长

6 months

期刊介绍： The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.