Effect of within-sample choice distribution and sample size on the estimation accuracy of logit model

2015 International Conference on Transportation Information and Safety (ICTIS). Pub Date: 2015-06-25. DOI: 10.1109/ICTIS.2015.7232160

Minhui Zeng, M. Zhong, J. Hunt



Abstract


Within-sample choice distribution and sample size are important considerations in the estimation of logit models, but their effects on estimation accuracy have not been systematically studied. This paper therefore examines both issues empirically using a set of simulated choice datasets. The utility function coefficients and alternative-specific constants (ASCs) are specified a priori. Then, assuming that alternative attributes and error components follow a normal distribution, both revealed preference (RP) and stated preference (SP) synthetic choice datasets are simulated. From these simulated datasets, the utility coefficients and ASCs are re-estimated and compared with the values originally specified. The utility coefficients can be re-estimated with reasonable accuracy, but the ASC estimates show much larger errors. The sums of squared errors (SSEs) between the original and the estimated utility coefficients and ASCs are calculated and plotted for RP and SP datasets of varying sample size, and the corresponding points of diminishing marginal returns are identified. Regarding within-sample choice distribution, the results show that the hit ratio decreases as the distribution becomes more balanced. It appears that, when alternatives are chosen with similar frequency, choosing one alternative over another makes little difference in the utility perceived by each decision-maker. It is therefore suggested that a population with varying socioeconomic characteristics be created and used in future studies.
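The simulate-then-re-estimate procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the coefficient and ASC values, the choice of three alternatives, the standard normal attributes, and the function names (`simulate_choices`, `fit_mnl`, `run_experiment`) are all hypothetical. One deliberate substitution: the abstract states that error components are normal, whereas this sketch draws Gumbel errors so that the multinomial logit model is correctly specified for the simulated data.

```python
import numpy as np

# Hypothetical "true" parameters, chosen only for illustration.
BETA_TRUE = np.array([-1.0, 0.5])   # generic utility coefficients
ASC_TRUE = np.array([0.5, -0.5])    # ASCs of alternatives 1 and 2 (alternative 0 is the base)

def simulate_choices(n, beta, asc, rng):
    """Draw attributes and errors; return attributes X and chosen alternatives y."""
    J, K = len(asc) + 1, len(beta)
    X = rng.normal(size=(n, J, K))              # alternative attributes ~ N(0, 1)
    asc_full = np.concatenate(([0.0], asc))     # base alternative has ASC = 0
    V = X @ beta + asc_full                     # systematic utility, shape (n, J)
    U = V + rng.gumbel(size=(n, J))             # assumption: Gumbel errors, not normal
    return X, U.argmax(axis=1)                  # each decision-maker picks max utility

def design(X):
    """Append ASC dummy columns so that theta = [beta, asc]."""
    n, J, _ = X.shape
    dummies = np.broadcast_to(np.eye(J)[:, 1:], (n, J, J - 1))
    return np.concatenate([X, dummies], axis=2)

def probabilities(Z, theta):
    """Multinomial logit choice probabilities, shape (n, J)."""
    V = Z @ theta
    expV = np.exp(V - V.max(axis=1, keepdims=True))  # shift for numerical stability
    return expV / expV.sum(axis=1, keepdims=True)

def fit_mnl(Z, y, iters=25):
    """Maximum-likelihood estimate via Newton-Raphson on the log-likelihood."""
    n, J, P = Z.shape
    theta = np.zeros(P)
    Y = np.eye(J)[y]                                 # one-hot chosen alternative
    for _ in range(iters):
        p = probabilities(Z, theta)
        grad = np.einsum("nj,njp->p", Y - p, Z)      # score vector
        zbar = np.einsum("nj,njp->np", p, Z)         # probability-weighted mean attributes
        Zc = Z - zbar[:, None, :]
        hess = -np.einsum("nj,njp,njq->pq", p, Zc, Zc)
        theta = theta - np.linalg.solve(hess, grad)  # Newton step
    return theta

def run_experiment(n, seed=0):
    """Return (SSE of coefficients, SSE of ASCs, in-sample hit ratio, choice shares)."""
    rng = np.random.default_rng(seed)
    X, y = simulate_choices(n, BETA_TRUE, ASC_TRUE, rng)
    Z = design(X)
    theta = fit_mnl(Z, y)
    K = len(BETA_TRUE)
    sse_beta = float(np.sum((theta[:K] - BETA_TRUE) ** 2))
    sse_asc = float(np.sum((theta[K:] - ASC_TRUE) ** 2))
    hit = float(np.mean(probabilities(Z, theta).argmax(axis=1) == y))
    shares = np.bincount(y, minlength=Z.shape[1]) / n  # within-sample choice distribution
    return sse_beta, sse_asc, hit, shares

if __name__ == "__main__":
    for n in (200, 1000, 5000):
        sse_beta, sse_asc, hit, shares = run_experiment(n)
        print(n, round(sse_beta, 4), round(sse_asc, 4), round(hit, 3), shares.round(3))
```

Plotting the two SSE values against `n` over a finer grid of sample sizes would give curves of the kind the abstract uses to locate diminishing returns, and shifting `ASC_TRUE` changes the choice shares, which allows the hit ratio to be compared across more and less balanced samples.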
