Nicola Ortelli, Matthieu de Lapparent, Michel Bierlaire
Resampling estimation of discrete choice models
Journal of Choice Modelling, volume 50, Article 100467. Published 2024-01-03. DOI: 10.1016/j.jocm.2023.100467
https://www.sciencedirect.com/science/article/pii/S1755534523000684
In the context of discrete choice modeling, the extraction of potential behavioral insights from large datasets is often limited by the poor scalability of maximum likelihood estimation. This paper proposes a simple and fast dataset-reduction method that is specifically designed to preserve the richness of observations originally present in a dataset, while reducing the computational complexity of the estimation process. Our approach, called LSH-DR, leverages locality-sensitive hashing to create homogeneous clusters, from which representative observations are then sampled and weighted. We demonstrate the efficacy of our approach by applying it to a real-world mode choice dataset: the results show that the samples generated by LSH-DR allow for substantial savings in estimation time while preserving estimation efficiency at little cost.
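The cluster, sample, and weight idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the hash family, the sampling rule, or the weights. The sketch assumes random-hyperplane (sign) hashing, one uniformly drawn representative per bucket, and a weight equal to the bucket size; the function name `lsh_reduce` and the parameters `n_planes` and `seed` are hypothetical.

```python
import numpy as np


def lsh_reduce(X, n_planes=8, seed=0):
    """Illustrative sketch only (assumed details, not the paper's LSH-DR).

    Buckets the rows of X with random-hyperplane LSH, then keeps one
    random representative per bucket, weighted by the bucket size.
    Returns (row indices of representatives, their weights).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # Center the data so hyperplanes through the origin actually split it.
    Xc = X - X.mean(axis=0)

    # Each column of `planes` is the normal vector of one random hyperplane.
    planes = rng.standard_normal((X.shape[1], n_planes))

    # The side of each hyperplane a row falls on gives one bit of its signature.
    bits = ((Xc @ planes) > 0).astype(np.int64)

    # Pack the bit signature into a single integer bucket key.
    keys = bits @ (1 << np.arange(n_planes, dtype=np.int64))

    idx, weights = [], []
    for key in np.unique(keys):
        members = np.flatnonzero(keys == key)
        idx.append(rng.choice(members))   # assumed rule: uniform draw per bucket
        weights.append(members.size)      # assumed weight: bucket size
    return np.array(idx), np.array(weights, dtype=float)


# Usage on synthetic data (not the paper's dataset).
X = np.random.default_rng(1).normal(size=(500, 4))
idx, w = lsh_reduce(X)
```

Under these assumptions the weights sum to the original number of observations, so a weighted log-likelihood computed on the reduced sample has the same total weight as the full-sample log-likelihood. Raising `n_planes` yields more, smaller buckets and therefore a larger but more faithful reduced sample.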