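A novel data fusion method to leverage passively-collected mobility data in generating spatially-heterogeneous synthetic population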

IF 5.8 | Zone 1, Engineering & Technology | JCR Q1, Economics
Khoa D. Vo, Eui-Jin Kim, Prateek Bansal
Journal: Transportation Research Part B-Methodological, Volume 191, Article 103128
Published: 2024-11-26
DOI: 10.1016/j.trb.2024.103128
URL: https://www.sciencedirect.com/science/article/pii/S0191261524002522
Open access: No
Source platform: Semanticscholar
Citations: 0

Abstract
Conventional population synthesis methods rely on household travel survey (HTS) data. Because they generate sociodemographic and spatial attributes sequentially, they produce many infeasible attribute values, and the low sampling rate of HTS data leaves them with low spatial heterogeneity. Passively collected mobility (PCM) data (e.g., cellular traces) provide extensive spatial coverage but are hard to integrate with HTS data because the two differ in spatial resolution and attributes. This study introduces a novel cluster-based data fusion method that addresses these limitations and simultaneously generates synthetic populations with accurate sociodemographics and home–work locations at high spatial heterogeneity. Spatial clustering aligns the spatial resolution of HTS and PCM data, facilitating effective data integration. The data fusion process is reformulated into cluster-specific low-dimensional optimization subproblems to ensure computational tractability. Analytical properties are derived to retain essential distributional characteristics from both datasets in the fused distribution. The spatial clustering process is optimized to ensure such distributional consistency while balancing the feasibility and heterogeneity of the synthetic population. The data fusion properties are validated using HTS and LTE/5G cellular signaling data from Seoul, South Korea. Validation against census data confirms the method’s efficacy in maintaining distributional consistency while increasing spatial heterogeneity, with 97% of the generated population being unobserved in the HTS data. This research advances population synthesis by leveraging the complementary strengths of HTS and PCM data, providing a robust framework for generating spatially diverse synthetic populations essential for urban planning.
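One plausible reading of the 97% figure is the share of generated individuals whose combination of attributes never appears in the HTS sample. The sketch below shows how such a share could be computed under that assumption. It is not the paper's procedure: the function name `unobserved_share`, the record schema (age group, home zone, work zone), and the toy records are all hypothetical and invented for illustration.

```python
from typing import Hashable, Iterable, Tuple

# Hypothetical schema: one tuple per person, e.g. (age_group, home_zone, work_zone).
Record = Tuple[Hashable, ...]


def unobserved_share(synthetic: Iterable[Record], survey: Iterable[Record]) -> float:
    """Fraction of synthetic records whose attribute tuple never occurs in the survey."""
    seen = set(survey)  # distinct attribute combinations observed in the survey
    synthetic = list(synthetic)
    if not synthetic:
        raise ValueError("synthetic population is empty")
    novel = sum(1 for rec in synthetic if rec not in seen)
    return novel / len(synthetic)


# Toy data, invented for illustration only (not from the paper).
survey_sample = [("30-39", "zone_A", "zone_B"), ("40-49", "zone_C", "zone_C")]
synthetic_pop = [
    ("30-39", "zone_A", "zone_B"),  # also present in the survey
    ("30-39", "zone_A", "zone_D"),  # new home-work pair
    ("20-29", "zone_E", "zone_B"),  # new combination
    ("40-49", "zone_C", "zone_A"),  # new combination
]
share = unobserved_share(synthetic_pop, survey_sample)
print(f"{share:.0%} of synthetic records are unobserved in the survey")
```

A high share on its own only shows novelty. The abstract pairs it with validation against census data, which is what indicates that the new combinations are also distributionally consistent.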
Source journal
Transportation Research Part B-Methodological (Engineering & Technology - Engineering: Civil)
CiteScore: 12.40
Self-citation rate: 8.80%
Articles published: 143
Review time: 14.1 weeks
About the journal: Transportation Research: Part B publishes papers on all methodological aspects of the subject, particularly those that require mathematical analysis. The general theme of the journal is the development and solution of problems that are adequately motivated to deal with important aspects of the design and/or analysis of transportation systems. Areas covered include: traffic flow; design and analysis of transportation networks; control and scheduling; optimization; queuing theory; logistics; supply chains; development and application of statistical, econometric and mathematical models to address transportation problems; cost models; pricing and/or investment; traveler or shipper behavior; cost-benefit methodologies.