Mohamed Khachman, Catherine Morency, Francesco Ciari
Journal: Transportation
Published: 2024-08-27
DOI: 10.1007/s11116-024-10534-0 (https://doi.org/10.1007/s11116-024-10534-0)
Impact factor: 3.5; JCR: Q1 (Engineering, Civil)
Citations: 0
Abstract
A novel machine learning-based spatialized population synthesis framework
Synthetic populations are increasingly required in transportation demand modelling practice to feed the large-scale agent-based microsimulation platforms gaining in popularity. The quality of the synthetic population, i.e., its representativeness of the sociodemographic and the spatial distribution of the real population, is a determinant factor of the reliability of the microsimulation it feeds. While many research works focused on improving the sociodemographic accuracy of synthetic populations, the quality of their spatial distribution remained less covered. This paper suggests a new explicitly spatialized population synthesis framework. It leverages the performant Clustering Large Applications (CLARA) and Random Forest algorithms as well as rich spatial information collected as part of surveys to make accurate predictions of synthetic households’ locations at the building scale directly. In addition to preserving optimal sociodemographic accuracy and achieving realistic explicit spatialization, the new framework shows acceptable transferability thanks to CLARA’s efficiency. An explicitly spatialized synthetic population for Montreal Island is generated using the proposed clustering + classification framework. The four components of the proposed framework have generated satisfactory results with the zonal synthetic population established showing a 2.85% average relative error, the building clustering selected having a 0.48 average silhouette width, the classification model achieving a 0.79 macro-average F1 score, and 78.9% of the synthetic households being assigned to their preferred building cluster.
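The abstract attributes the framework's transferability to CLARA's efficiency. That efficiency comes from running k-medoids on small random samples rather than on the full dataset, then keeping the medoid set that gives the lowest total distance over all points. The following is a minimal Python sketch of that idea only, not the authors' implementation. The two building features, the demo data, and every function name are hypothetical, and `k_medoids` is a simplified alternating stand-in for the PAM procedure that CLARA normally uses. The Random Forest classification step is not shown.

```python
import random
from math import dist


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)


def k_medoids(points, k, rng, max_iter=50):
    """Simplified alternating k-medoids on a small sample (stand-in for PAM)."""
    # Farthest-point initialisation: start from a random point, then keep
    # adding the point farthest from the medoids chosen so far.
    medoids = [rng.choice(points)]
    while len(medoids) < k:
        medoids.append(max(points, key=lambda p: min(dist(p, m) for m in medoids)))
    for _ in range(max_iter):
        # Assign each sampled point to its nearest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, medoids[i]))
            clusters[nearest].append(p)
        # In each cluster, pick the member with the smallest total distance
        # to the other members; keep the old medoid if a cluster is empty.
        new_medoids = []
        for i, members in enumerate(clusters):
            if members:
                new_medoids.append(
                    min(members, key=lambda c: sum(dist(c, q) for q in members))
                )
            else:
                new_medoids.append(medoids[i])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def clara(points, k, n_samples=5, sample_size=40, seed=0):
    """CLARA-style clustering: cluster samples, score on the full dataset."""
    rng = random.Random(seed)
    best_medoids, best_cost = None, float("inf")
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = k_medoids(sample, k, rng)
        cost = total_cost(points, medoids)  # evaluated on ALL points
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    # Label every point with the index of its nearest selected medoid.
    labels = [
        min(range(k), key=lambda i: dist(p, best_medoids[i])) for p in points
    ]
    return best_medoids, labels, best_cost


def make_demo_buildings(seed=1, n_per_group=200):
    """Hypothetical data: two groups of buildings, two made-up features each."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0), (10.0, 10.0)]
    return [
        (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
        for cx, cy in centres
        for _ in range(n_per_group)
    ]


if __name__ == "__main__":
    buildings = make_demo_buildings()
    medoids, labels, cost = clara(buildings, k=2)
    print("medoids:", medoids)
    print("total cost:", round(cost, 2))
```

Each sample holds only `sample_size` points, so the expensive medoid search works on a few dozen points per draw, while the full dataset is used only for the cheaper nearest-medoid scoring and labelling.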
About the journal:
In our first issue, published in 1972, we explained that this Journal is intended to promote the free and vigorous exchange of ideas and experience among the worldwide community actively concerned with transportation policy, planning and practice. That continues to be our mission, with a clear focus on topics concerned with research and practice in transportation policy and planning, around the world.
These four terms (policy and planning, research and practice) are our key words. While we have a particular focus on transportation policy analysis and travel behaviour in the context of ground transportation, we willingly consider all good-quality papers that are highly relevant to transportation policy, planning and practice, with a clear focus on innovation and on extending the international pool of knowledge and understanding. Our interest is not only in transportation policies, systems and services, but also in their social, economic and environmental impacts. However, papers about the application of established procedures to, or the development of plans or policies for, specific locations are unlikely to prove acceptable unless they report experience that will be of real benefit to those working elsewhere. Papers concerned with the engineering, safety and operational management of transportation systems are outside our scope.