LibCity-Dataset: A Standardized and Comprehensive Dataset for Urban Spatial-temporal Data Mining

Jingyuan Wang, Wenjun Jiang, Jiawei Jiang
Intelligent Transportation Infrastructure. Published 2023-11-07. DOI: 10.1093/iti/liad021 (https://doi.org/10.1093/iti/liad021)

Abstract

The LibCity-Dataset contributes to urban spatial-temporal data mining by integrating macro-level traffic state data with micro-level trajectory data, giving researchers comprehensive and diverse urban spatial-temporal data. We first collected and processed existing open-source spatial-temporal data. We then independently collected Beijing taxi trajectory data through third-party interfaces, which helps address the current scarcity of open-source vehicle trajectory data. The distinctive feature of the LibCity-Dataset is its standardized storage format, implemented through atomic files. This format harmonizes diverse data sources, so that spatial-temporal prediction models can be applied across datasets with little extra effort. The uniform storage format simplifies experimentation and is intended to accelerate spatial-temporal prediction research. This Data Note describes how the LibCity-Dataset was created, covering data collection and processing, data description, data validation, and usage notes. By supporting open-source collaboration and setting a benchmark for standardization in the spatial-temporal prediction domain, the dataset aims to foster research cooperation and knowledge sharing. The dataset is openly available at https://drive.google.com/drive/folders/1g5v2Gq1tkOq8XO0HDCZ9nOTtRpB6-gPe.