Hierarchical Nearest Neighbor Gaussian Process models for discrete choice: Mode choice in New York City

IF 6.3 | Zone 1, Engineering and Technology | Q1 ECONOMICS

Transportation Research Part B-Methodological, Pub Date: 2024-11-28, DOI: 10.1016/j.trb.2024.103132

Daniel F. Villarraga, Ricardo A. Daziano

Volume 191, Article 103132. Journal Article. https://www.sciencedirect.com/science/article/pii/S019126152400256X

Citations: 0

Abstract

Standard Discrete Choice Models (DCMs) assume that unobserved effects that influence decision-making are independently and identically distributed among individuals. When unobserved effects are spatially correlated, the independence assumption does not hold, leading to biased standard errors and potentially biased parameter estimates. This paper proposes an interpretable Hierarchical Nearest Neighbor Gaussian Process (HNNGP) model to account for spatially correlated unobservables in discrete choice analysis. Gaussian Processes (GPs) are often regarded as lacking interpretability due to their non-parametric nature. However, we demonstrate how to incorporate GPs directly into the latent utility specification to flexibly model spatially correlated unobserved effects without sacrificing structural economic interpretation. To empirically test our proposed HNNGP models, we analyze binary and multinomial mode choices for commuting to work in New York City. For the multinomial case, we formulate and estimate HNNGPs with and without independence from irrelevant alternatives (IIA). Building on the interpretability of our modeling strategy, we provide both point estimates and credible intervals for the value of travel time savings in NYC. Finally, we compare the results from all proposed specifications with those derived from a standard logit model and a probit model with spatially autocorrelated errors (SAE) to showcase how accounting for different sources of spatial correlation in discrete choice can significantly impact inference. We also show that the HNNGP models attain better out-of-sample prediction performance when compared to the logit and probit SAE models, especially in the multinomial case.
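To make the idea of placing a spatial Gaussian Process inside the latent utility concrete, the following is a minimal sketch and not the authors' implementation. It assumes an exponential covariance function, a binary choice with a logit link, and a linear utility in travel time and cost differences. All function names, parameter values, and the neighbor ordering are hypothetical choices made for illustration. The nearest-neighbor approximation conditions each location only on its closest previously ordered locations, which avoids factorizing the full covariance matrix.

```python
import numpy as np


def exp_cov(a, b, sigma2=1.0, phi=1.0):
    """Exponential covariance between two sets of 2-D locations (an assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)


def sample_nngp(coords, m=5, sigma2=1.0, phi=1.0, rng=None):
    """Draw one realization of a nearest-neighbor GP at `coords`.

    Each location i is conditioned only on its (at most) m nearest
    neighbors among locations 0..i-1, in the order given.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = coords.shape[0]
    w = np.zeros(n)
    for i in range(n):
        if i == 0:
            mean, var = 0.0, sigma2  # first location: unconditional prior
        else:
            prev = coords[:i]
            dist = np.linalg.norm(prev - coords[i], axis=1)
            nb = np.argsort(dist)[:m]  # indices of nearest predecessors
            c_nn = exp_cov(prev[nb], prev[nb], sigma2, phi)
            c_nn = c_nn + 1e-10 * np.eye(len(nb))  # jitter for numerical stability
            c_in = exp_cov(coords[i:i + 1], prev[nb], sigma2, phi).ravel()
            b = np.linalg.solve(c_nn, c_in)  # kriging weights
            mean = b @ w[nb]  # conditional mean given neighbors
            var = sigma2 - b @ c_in  # conditional variance
        w[i] = mean + np.sqrt(max(var, 0.0)) * rng.standard_normal()
    return w


def choice_prob(time_diff, cost_diff, w, beta_time=-0.05, beta_cost=-0.10):
    """Binary logit probability with a spatial effect w added to the utility.

    The coefficient values are hypothetical placeholders, not estimates
    from the paper.
    """
    v = beta_time * time_diff + beta_cost * cost_diff + w
    return 1.0 / (1.0 + np.exp(-v))
```

Calling `sample_nngp` on an array of commuter coordinates of shape `(n, 2)` returns one spatial effect per individual, which `choice_prob` then adds to the systematic utility. In a full hierarchical model these effects and the coefficients would be estimated jointly from data rather than simulated.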

Source journal

Transportation Research Part B-Methodological (Engineering and Technology: Civil Engineering)

CiteScore

12.40

Self-citation rate

8.80%

Articles published

143

Review time

14.1 weeks

Journal description: Transportation Research: Part B publishes papers on all methodological aspects of the subject, particularly those that require mathematical analysis. The general theme of the journal is the development and solution of problems that are adequately motivated to deal with important aspects of the design and/or analysis of transportation systems. Areas covered include: traffic flow; design and analysis of transportation networks; control and scheduling; optimization; queuing theory; logistics; supply chains; development and application of statistical, econometric and mathematical models to address transportation problems; cost models; pricing and/or investment; traveler or shipper behavior; cost-benefit methodologies.