Revisiting McFadden’s correction factor for sampling of alternatives in multinomial logit and mixed multinomial logit models

IF 6.3 · Zone 1 (Engineering & Technology) · Q1 ECONOMICS

Transportation Research Part B-Methodological Pub Date : 2025-02-01 DOI:10.1016/j.trb.2024.103129

Thijs Dekker, Prateek Bansal, Jinghai Huo

Volume 192, Article 103129. Full text: https://www.sciencedirect.com/science/article/pii/S0191261524002534

Citations: 0

Abstract

When estimating multinomial logit (MNL) models in which choices are made from a large set of available alternatives, computational benefits can be achieved by estimating a quasi-likelihood function based on a sampled subset of alternatives in combination with ‘McFadden’s correction factor’. In this paper, we prove theoretically that McFadden’s correction factor minimises the expected information loss in the parameters of interest and thereby has convenient finite-sample (and large-sample) properties. That is, in the context of Bayesian estimation, sampling of alternatives in combination with McFadden’s correction factor provides the best approximation of the posterior distribution of the parameters of interest, irrespective of sample size. As sample sizes become sufficiently large, consistent point estimates for MNL can be obtained, as per McFadden’s original proof. McFadden’s correction factor can therefore be applied effectively in Bayesian MNL models. We extend these results to mixed multinomial logit (MMNL) models by using the property of data augmentation in Bayesian estimation. McFadden’s correction factor minimises the expected information loss with respect to the augmented individual-level parameters and, in turn, also for the population parameters characterising the shape and location of the mixing density in MMNL. Again, the results apply to finite and large samples and, most importantly, circumvent the need for the additional correction factors previously identified for estimating MMNL models using maximum simulated likelihood. Monte Carlo simulations validate this result for sampling of alternatives in Bayesian MMNL models.
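To make the role of the correction factor concrete, the following is a minimal sketch and not code from the paper. It assumes the standard textbook form of McFadden's correction, in which the utility V_j of each alternative in the sampled set D is shifted by ln π(D | j), the log-probability of drawing D had j been the chosen alternative. The function name `corrected_log_prob`, the sampled set, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def corrected_log_prob(v, log_pi, chosen):
    """Log of P(chosen | D) for one observation under sampling of alternatives.

    v       : utilities V_j of the alternatives in the sampled set D
    log_pi  : ln pi(D | j), log-probability of drawing D had j been chosen
    chosen  : position of the chosen alternative within D
    """
    z = np.asarray(v, dtype=float) + np.asarray(log_pi, dtype=float)
    z = z - z.max()  # stabilise the log-sum-exp
    return z[chosen] - np.log(np.exp(z).sum())

# Made-up example: 50 alternatives in the full choice set.
rng = np.random.default_rng(0)
v_full = rng.normal(size=50)
p_full = np.exp(v_full - v_full.max())
p_full /= p_full.sum()                      # full-set MNL probabilities

D = np.array([3, 7, 19, 42])                # hypothetical sampled set
pi = np.array([0.02, 0.10, 0.05, 0.01])     # hypothetical pi(D | j) for j in D
chosen = 1                                  # alternative 7 is the chosen one

lp = corrected_log_prob(v_full[D], np.log(pi), chosen)

# Bayes' rule on the full-set probabilities gives the same conditional
# probability: P(i | D) = pi(D|i) P(i) / sum_{j in D} pi(D|j) P(j).
bayes = pi[chosen] * p_full[D][chosen] / (pi * p_full[D]).sum()
print(np.isclose(np.exp(lp), bayes))
```

The comparison with Bayes' rule shows why only the sampled alternatives are needed: the full-set denominator cancels, so the corrected quasi-likelihood can be evaluated on D alone. When π(D | j) is identical for every j in D, the correction cancels as well and the expression reduces to an ordinary logit over the sampled set.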

Source journal: Transportation Research Part B-Methodological (Engineering & Technology; Engineering: Civil)

CiteScore: 12.40
Self-citation rate: 8.80%
Articles published: 143
Review time: 14.1 weeks

Journal description: Transportation Research: Part B publishes papers on all methodological aspects of the subject, particularly those that require mathematical analysis. The general theme of the journal is the development and solution of problems that are adequately motivated to deal with important aspects of the design and/or analysis of transportation systems. Areas covered include: traffic flow; design and analysis of transportation networks; control and scheduling; optimization; queuing theory; logistics; supply chains; development and application of statistical, econometric and mathematical models to address transportation problems; cost models; pricing and/or investment; traveler or shipper behavior; cost-benefit methodologies.