Inferring in-home/out-of-home situations unreported in time-use surveys using supervised machine learning
Authors: Shunsuke Arao, Takuya Maruyama
Journal: Travel Behaviour and Society, volume 38, Article 100928 (impact factor 5.1; JCR Q1, Transportation)
Publication date: 2024-10-11
Publication type: Journal Article
DOI: 10.1016/j.tbs.2024.100928
URL: https://www.sciencedirect.com/science/article/pii/S2214367X24001911
Open access: no
Citation count: 0
Abstract
Time-use surveys provide useful data for travel analyses. However, the survey on time use and leisure activities (TULA) Questionnaire A, a representative time-use survey in Japan, does not include questions related to the locations of activities, thus making it difficult to use for travel analyses. This study proposes machine-learning methods to determine the in-home/out-of-home situations of TULA Questionnaire A using TULA Questionnaire B with activity locations as the training data. Random forest performs better than logistic regression and decision trees in the inference. The activity was the most important factor in determining the in-home/out-of-home situations, followed by the accompanying person and time of day. The inferred outputs in the TULA Questionnaire A included the individual-based out-of-home rate profiles and the proportions of mobile persons from 1996 to 2016. Using these outputs, we analyzed trip misreporting in household travel surveys. Comparisons with nationwide and Tokyo person trip (PT) surveys implied soft refusals and trip misreporting in travel surveys. The comparison with the nationwide PT surveys suggested higher soft refusals on weekends than on weekdays. The comparison with the 1998, 2008, and 2018 Tokyo PT surveys implied the increased soft refusal in PT surveys, particularly among the male group aged between 20 and 39 and the female group aged between 35 and 49 during 1998–2018. These results suggest that careful handling of recent household travel survey data may be required. In addition, the proposed machine-learning-based method enables us to utilize the rich sample of Questionnaire A for activity-based travel analysis in future studies.
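The workflow described in the abstract (train on records that carry a location label, compare three classifiers, then infer the label for records that lack it) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' implementation: the feature names (`activity`, `companion`, `hour`), their codings, the data-generating rule, and all model settings are assumptions made up for the example, and the printed numbers say nothing about the actual TULA results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
FEATURES = ["activity", "companion", "hour"]  # hypothetical column names


def make_synthetic_diary(n):
    """Generate made-up diary records; this is NOT real TULA data."""
    activity = rng.integers(0, 6, size=n)    # hypothetical activity codes 0..5
    companion = rng.integers(0, 3, size=n)   # hypothetical: 0 alone, 1 family, 2 others
    hour = rng.integers(0, 24, size=n)       # hour of day 0..23
    # Arbitrary rule linking the features to being out of home.
    logit = (-1.0 + 1.5 * (activity >= 3) + 0.8 * (companion == 2)
             + 0.6 * ((hour >= 9) & (hour < 18)))
    out_of_home = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))
    return np.column_stack([activity, companion, hour]), out_of_home.astype(int)


# Labelled records play the role of Questionnaire B; unlabelled ones, Questionnaire A.
X_b, y_b = make_synthetic_diary(4000)
X_a, _ = make_synthetic_diary(2000)  # labels discarded to mimic missing locations

X_train, X_test, y_train, y_test = train_test_split(
    X_b, y_b, test_size=0.25, random_state=0
)

# All three features are treated as categorical and one-hot encoded.
encoder = OneHotEncoder(handle_unknown="ignore").fit(X_train)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Compare the classifiers on held-out labelled records.
accuracy = {}
for name, model in models.items():
    model.fit(encoder.transform(X_train), y_train)
    accuracy[name] = model.score(encoder.transform(X_test), y_test)

# Sum the one-hot column importances back to one value per original feature.
rf = models["random_forest"]
sizes = [len(c) for c in encoder.categories_]
parts = np.split(rf.feature_importances_, np.cumsum(sizes)[:-1])
importance = {f: float(p.sum()) for f, p in zip(FEATURES, parts)}

# Infer in-home/out-of-home for the unlabelled records and build an hourly profile.
pred_a = rf.predict(encoder.transform(X_a))
hours = X_a[:, 2]
counts = np.bincount(hours, minlength=24)
out_rate_by_hour = np.bincount(hours, weights=pred_a, minlength=24) / np.maximum(counts, 1)

print("held-out accuracy:", accuracy)
print("random forest importance by feature:", importance)
print("inferred out-of-home rate by hour:", np.round(out_rate_by_hour, 2))
```

The hourly profile at the end stands in for the out-of-home rate profiles mentioned in the abstract. A real application would aggregate per individual and apply survey weights, which this sketch leaves out.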
About the journal:
Travel Behaviour and Society is an interdisciplinary journal publishing high-quality original papers which report leading edge research in theories, methodologies and applications concerning transportation issues and challenges which involve the social and spatial dimensions. In particular, it provides a discussion forum for major research in travel behaviour, transportation infrastructure, transportation and environmental issues, mobility and social sustainability, transportation geographic information systems (TGIS), transportation and quality of life, transportation data collection and analysis, etc.