Scalable kernel logistic regression with Nyström approximation: Theoretical analysis and application to discrete choice modelling

IF 5.5 · CAS Tier 2, Computer Science · JCR Q1, Computer Science, Artificial Intelligence
José Ángel Martín-Baos, Ricardo García-Ródenas, Luis Rodriguez-Benitez, Michel Bierlaire
{"title":"可扩展的核逻辑回归Nyström近似:理论分析和应用,以离散选择建模","authors":"José Ángel Martín-Baos ,&nbsp;Ricardo García-Ródenas ,&nbsp;Luis Rodriguez-Benitez ,&nbsp;Michel Bierlaire","doi":"10.1016/j.neucom.2024.128975","DOIUrl":null,"url":null,"abstract":"<div><div>The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of parameters involved in these models. This complexity hampers the efficient training of large-scale models. This paper addresses these problems of scalability by introducing the Nyström approximation for Kernel Logistic Regression (KLR) on large datasets. The study begins by presenting a theoretical analysis in which: (i) the set of KLR solutions is characterised, (ii) an upper bound to the solution of KLR with Nyström approximation is provided, and finally (iii) a specialisation of the optimisation algorithms to Nyström KLR is described. After this, the Nyström KLR is computationally validated. Four landmark selection methods are tested, including basic uniform sampling, a <span><math><mi>k</mi></math></span>-means sampling strategy, and two non-uniform methods grounded in leverage scores. The performance of these strategies is evaluated using large-scale transport mode choice datasets and is compared with traditional methods such as Multinomial Logit (MNL) and contemporary ML techniques. The study also assesses the efficiency of various optimisation techniques for the proposed Nyström KLR model. The performance of gradient descent, Momentum, Adam, and L-BFGS-B optimisation methods is examined on these datasets. Among these strategies, the <span><math><mi>k</mi></math></span>-means Nyström KLR approach emerges as a successful solution for applying KLR to large datasets, particularly when combined with the L-BFGS-B and Adam optimisation methods. 
The results highlight the ability of this strategy to handle datasets exceeding 200,000 observations while maintaining robust performance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128975"},"PeriodicalIF":5.5000,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Scalable kernel logistic regression with Nyström approximation: Theoretical analysis and application to discrete choice modelling\",\"authors\":\"José Ángel Martín-Baos ,&nbsp;Ricardo García-Ródenas ,&nbsp;Luis Rodriguez-Benitez ,&nbsp;Michel Bierlaire\",\"doi\":\"10.1016/j.neucom.2024.128975\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of parameters involved in these models. This complexity hampers the efficient training of large-scale models. This paper addresses these problems of scalability by introducing the Nyström approximation for Kernel Logistic Regression (KLR) on large datasets. The study begins by presenting a theoretical analysis in which: (i) the set of KLR solutions is characterised, (ii) an upper bound to the solution of KLR with Nyström approximation is provided, and finally (iii) a specialisation of the optimisation algorithms to Nyström KLR is described. After this, the Nyström KLR is computationally validated. Four landmark selection methods are tested, including basic uniform sampling, a <span><math><mi>k</mi></math></span>-means sampling strategy, and two non-uniform methods grounded in leverage scores. The performance of these strategies is evaluated using large-scale transport mode choice datasets and is compared with traditional methods such as Multinomial Logit (MNL) and contemporary ML techniques. 
The study also assesses the efficiency of various optimisation techniques for the proposed Nyström KLR model. The performance of gradient descent, Momentum, Adam, and L-BFGS-B optimisation methods is examined on these datasets. Among these strategies, the <span><math><mi>k</mi></math></span>-means Nyström KLR approach emerges as a successful solution for applying KLR to large datasets, particularly when combined with the L-BFGS-B and Adam optimisation methods. The results highlight the ability of this strategy to handle datasets exceeding 200,000 observations while maintaining robust performance.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"617 \",\"pages\":\"Article 128975\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2024-11-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231224017466\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231224017466","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of parameters involved in these models. This complexity hampers the efficient training of large-scale models. This paper addresses these problems of scalability by introducing the Nyström approximation for Kernel Logistic Regression (KLR) on large datasets. The study begins by presenting a theoretical analysis in which: (i) the set of KLR solutions is characterised, (ii) an upper bound to the solution of KLR with Nyström approximation is provided, and finally (iii) a specialisation of the optimisation algorithms to Nyström KLR is described. After this, the Nyström KLR is computationally validated. Four landmark selection methods are tested, including basic uniform sampling, a k-means sampling strategy, and two non-uniform methods grounded in leverage scores. The performance of these strategies is evaluated using large-scale transport mode choice datasets and is compared with traditional methods such as Multinomial Logit (MNL) and contemporary ML techniques. The study also assesses the efficiency of various optimisation techniques for the proposed Nyström KLR model. The performance of gradient descent, Momentum, Adam, and L-BFGS-B optimisation methods is examined on these datasets. Among these strategies, the k-means Nyström KLR approach emerges as a successful solution for applying KLR to large datasets, particularly when combined with the L-BFGS-B and Adam optimisation methods. The results highlight the ability of this strategy to handle datasets exceeding 200,000 observations while maintaining robust performance.
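The abstract's core recipe — pick landmark points (e.g. by k-means), build a Nyström feature map from the kernel, then fit a logistic model on those features — can be illustrated with a minimal sketch. This is not the authors' implementation; it is a generic Nyström construction (features Φ = K(X,Z) · W^(−1/2) with W = K(Z,Z)) over a hypothetical RBF kernel and a synthetic dataset, using scikit-learn's KMeans for landmark selection and its L-BFGS-based LogisticRegression for the final fit.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def rbf(A, B, gamma=1.0):
    # Pairwise RBF kernel: k(a, b) = exp(-gamma * ||a - b||^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_features(X, landmarks, gamma=1.0, eps=1e-10):
    # Nystrom map: Phi = K(X, Z) @ W^(-1/2), with W = K(Z, Z).
    # W^(-1/2) is formed from W's eigendecomposition, clipping tiny
    # eigenvalues for numerical stability.
    C = rbf(X, landmarks, gamma)
    W = rbf(landmarks, landmarks, gamma)
    vals, vecs = np.linalg.eigh(W)
    vals = np.maximum(vals, eps)
    return C @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)

# Synthetic problem with a non-linear (circular) decision boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)

# k-means landmark selection, then a linear logistic model on the
# m-dimensional Nystrom features (m = 20 here, instead of n = 500
# kernel parameters for exact KLR).
Z = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X).cluster_centers_
Phi = nystrom_features(X, Z, gamma=1.0)
clf = LogisticRegression(max_iter=1000).fit(Phi, y)  # default solver is L-BFGS
print(clf.score(Phi, y))
```

The point of the approximation is the parameter count: the logistic model is trained on m Nyström features rather than n kernel columns, so memory and optimisation cost scale with the number of landmarks, not the number of observations.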
Source journal
Neurocomputing (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles per year: 1382
Review time: 70 days
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.