Effect of within-sample choice distribution and sample size on the estimation accuracy of logit model
Authors: Minhui Zeng, M. Zhong, J. Hunt
Published in: 2015 International Conference on Transportation Information and Safety (ICTIS), 2015-06-25
DOI: 10.1109/ICTIS.2015.7232160 (https://doi.org/10.1109/ICTIS.2015.7232160)
Within-sample choice distribution and sample size are important considerations in estimating a logit model, but their effects on estimation accuracy have not been systematically studied. This paper therefore examines both issues empirically using a set of simulated choice datasets. The utility function coefficients and alternative-specific constants (ASCs) are specified a priori. Then, assuming that alternative attributes and error components follow a normal distribution, both revealed preference (RP) and stated preference (SP) synthetic choice datasets are simulated. From these simulated datasets, the utility coefficients and ASCs are re-estimated and compared with the originally specified values. The utility coefficients can be re-estimated with reasonable accuracy, but the ASC estimates show much larger errors. The sums of squared errors (SSEs) between the original and the estimated utility coefficients and ASCs are calculated and plotted for RP and SP datasets of varying sample size, and the corresponding points of diminishing marginal returns are identified. Regarding within-sample choice distribution, the results show that the hit ratio decreases as the distribution becomes more balanced. It appears that, when alternatives are chosen with similar frequency, choosing one alternative over another makes little difference in the utility perceived by each decision-maker. It is therefore suggested that a population with varying socioeconomic characteristics be created and used in future studies.
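The simulate, re-estimate and compare procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The parameter values, the three-alternative setup, the function names and the gradient-ascent estimator are all hypothetical. The sketch also departs from the abstract in one respect: it draws Gumbel (extreme value) errors, the assumption under which the logit model is exact, whereas the abstract states normally distributed error components. The hit ratio is taken here to be the share of observations whose highest-probability alternative is the one actually chosen.

```python
import math
import random

# Hypothetical "true" parameters, chosen only for illustration (not from the paper).
TRUE_BETA = [-1.0, 0.5]        # generic attribute coefficients
TRUE_ASC = [0.0, 0.5, -0.5]    # ASC of the first alternative is fixed at 0
TRUE_THETA = TRUE_BETA + TRUE_ASC[1:]


def simulate(n, rng):
    """Simulate n choices: normal attributes, Gumbel errors, utility maximisation."""
    data = []
    for _ in range(n):
        x = [[rng.gauss(0.0, 1.0) for _ in TRUE_BETA] for _ in TRUE_ASC]
        best, best_u = 0, -math.inf
        for j, asc in enumerate(TRUE_ASC):
            u01 = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            eps = -math.log(-math.log(u01))  # standard Gumbel draw (assumption)
            u = asc + sum(b * v for b, v in zip(TRUE_BETA, x[j])) + eps
            if u > best_u:
                best, best_u = j, u
        data.append((x, best))
    return data


def probabilities(theta, x):
    """Multinomial logit choice probabilities for one observation."""
    k = len(TRUE_BETA)
    asc = [0.0] + list(theta[k:])
    v = [asc[j] + sum(theta[i] * x[j][i] for i in range(k)) for j in range(len(x))]
    vmax = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(vj - vmax) for vj in v]
    total = sum(e)
    return [ej / total for ej in e]


def fit(data, iters=100, step=1.0):
    """Maximise the mean log-likelihood by plain gradient ascent."""
    k = len(TRUE_BETA)
    theta = [0.0] * len(TRUE_THETA)
    for _ in range(iters):
        grad = [0.0] * len(theta)
        for x, chosen in data:
            p = probabilities(theta, x)
            for j, pj in enumerate(p):
                w = (1.0 if j == chosen else 0.0) - pj  # observed minus predicted
                for i in range(k):
                    grad[i] += w * x[j][i]
                if j > 0:
                    grad[k + j - 1] += w
        theta = [t + step * g / len(data) for t, g in zip(theta, grad)]
    return theta


def sse(theta):
    """Sum of squared errors between estimated and true parameters."""
    return sum((a - b) ** 2 for a, b in zip(theta, TRUE_THETA))


def hit_ratio(theta, data):
    """Share of observations whose highest-probability alternative was chosen."""
    hits = 0
    for x, chosen in data:
        p = probabilities(theta, x)
        hits += p.index(max(p)) == chosen
    return hits / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (200, 800):
        sample = simulate(n, rng)
        est = fit(sample)
        print(n, round(sse(est), 4), round(hit_ratio(est, sample), 3))
```

Repeating the loop over a wider range of sample sizes and plotting the SSE against n is one way to look for the point of diminishing returns mentioned above. Shifting the hypothetical ASCs changes how balanced the simulated choice shares are, which allows the hit ratio to be compared across choice distributions.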