Scalable kernel logistic regression with Nyström approximation: Theoretical analysis and application to discrete choice modelling

IF 5.5 · CAS Tier 2, Computer Science · JCR Q1, Computer Science, Artificial Intelligence
José Ángel Martín-Baos, Ricardo García-Ródenas, Luis Rodriguez-Benitez, Michel Bierlaire
{"title":"可扩展的核逻辑回归Nyström近似:理论分析和应用,以离散选择建模","authors":"José Ángel Martín-Baos ,&nbsp;Ricardo García-Ródenas ,&nbsp;Luis Rodriguez-Benitez ,&nbsp;Michel Bierlaire","doi":"10.1016/j.neucom.2024.128975","DOIUrl":null,"url":null,"abstract":"<div><div>The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of parameters involved in these models. This complexity hampers the efficient training of large-scale models. This paper addresses these problems of scalability by introducing the Nyström approximation for Kernel Logistic Regression (KLR) on large datasets. The study begins by presenting a theoretical analysis in which: (i) the set of KLR solutions is characterised, (ii) an upper bound to the solution of KLR with Nyström approximation is provided, and finally (iii) a specialisation of the optimisation algorithms to Nyström KLR is described. After this, the Nyström KLR is computationally validated. Four landmark selection methods are tested, including basic uniform sampling, a <span><math><mi>k</mi></math></span>-means sampling strategy, and two non-uniform methods grounded in leverage scores. The performance of these strategies is evaluated using large-scale transport mode choice datasets and is compared with traditional methods such as Multinomial Logit (MNL) and contemporary ML techniques. The study also assesses the efficiency of various optimisation techniques for the proposed Nyström KLR model. The performance of gradient descent, Momentum, Adam, and L-BFGS-B optimisation methods is examined on these datasets. Among these strategies, the <span><math><mi>k</mi></math></span>-means Nyström KLR approach emerges as a successful solution for applying KLR to large datasets, particularly when combined with the L-BFGS-B and Adam optimisation methods. 
The results highlight the ability of this strategy to handle datasets exceeding 200,000 observations while maintaining robust performance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128975"},"PeriodicalIF":5.5000,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Scalable kernel logistic regression with Nyström approximation: Theoretical analysis and application to discrete choice modelling\",\"authors\":\"José Ángel Martín-Baos ,&nbsp;Ricardo García-Ródenas ,&nbsp;Luis Rodriguez-Benitez ,&nbsp;Michel Bierlaire\",\"doi\":\"10.1016/j.neucom.2024.128975\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of parameters involved in these models. This complexity hampers the efficient training of large-scale models. This paper addresses these problems of scalability by introducing the Nyström approximation for Kernel Logistic Regression (KLR) on large datasets. The study begins by presenting a theoretical analysis in which: (i) the set of KLR solutions is characterised, (ii) an upper bound to the solution of KLR with Nyström approximation is provided, and finally (iii) a specialisation of the optimisation algorithms to Nyström KLR is described. After this, the Nyström KLR is computationally validated. Four landmark selection methods are tested, including basic uniform sampling, a <span><math><mi>k</mi></math></span>-means sampling strategy, and two non-uniform methods grounded in leverage scores. The performance of these strategies is evaluated using large-scale transport mode choice datasets and is compared with traditional methods such as Multinomial Logit (MNL) and contemporary ML techniques. 
The study also assesses the efficiency of various optimisation techniques for the proposed Nyström KLR model. The performance of gradient descent, Momentum, Adam, and L-BFGS-B optimisation methods is examined on these datasets. Among these strategies, the <span><math><mi>k</mi></math></span>-means Nyström KLR approach emerges as a successful solution for applying KLR to large datasets, particularly when combined with the L-BFGS-B and Adam optimisation methods. The results highlight the ability of this strategy to handle datasets exceeding 200,000 observations while maintaining robust performance.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"617 \",\"pages\":\"Article 128975\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2024-11-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231224017466\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231224017466","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of parameters involved in these models. This complexity hampers the efficient training of large-scale models. This paper addresses these problems of scalability by introducing the Nyström approximation for Kernel Logistic Regression (KLR) on large datasets. The study begins by presenting a theoretical analysis in which: (i) the set of KLR solutions is characterised, (ii) an upper bound to the solution of KLR with Nyström approximation is provided, and finally (iii) a specialisation of the optimisation algorithms to Nyström KLR is described. After this, the Nyström KLR is computationally validated. Four landmark selection methods are tested, including basic uniform sampling, a k-means sampling strategy, and two non-uniform methods grounded in leverage scores. The performance of these strategies is evaluated using large-scale transport mode choice datasets and is compared with traditional methods such as Multinomial Logit (MNL) and contemporary ML techniques. The study also assesses the efficiency of various optimisation techniques for the proposed Nyström KLR model. The performance of gradient descent, Momentum, Adam, and L-BFGS-B optimisation methods is examined on these datasets. Among these strategies, the k-means Nyström KLR approach emerges as a successful solution for applying KLR to large datasets, particularly when combined with the L-BFGS-B and Adam optimisation methods. The results highlight the ability of this strategy to handle datasets exceeding 200,000 observations while maintaining robust performance.
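The abstract's core recipe — pick landmark points (e.g. by k-means), build a Nyström feature map from the kernel, then fit a logistic model on those features — can be illustrated with a minimal sketch. This is not the authors' implementation; it is a generic Nyström construction (features Φ = K(X,Z) · W^(−1/2) with W = K(Z,Z)) over a hypothetical RBF kernel and a synthetic dataset, using scikit-learn's KMeans for landmark selection and its L-BFGS-based LogisticRegression for the final fit.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def rbf(A, B, gamma=1.0):
    # Pairwise RBF kernel: k(a, b) = exp(-gamma * ||a - b||^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_features(X, landmarks, gamma=1.0, eps=1e-10):
    # Nystrom map: Phi = K(X, Z) @ W^(-1/2), with W = K(Z, Z).
    # W^(-1/2) is formed from W's eigendecomposition, clipping tiny
    # eigenvalues for numerical stability.
    C = rbf(X, landmarks, gamma)
    W = rbf(landmarks, landmarks, gamma)
    vals, vecs = np.linalg.eigh(W)
    vals = np.maximum(vals, eps)
    return C @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)

# Synthetic problem with a non-linear (circular) decision boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)

# k-means landmark selection, then a linear logistic model on the
# m-dimensional Nystrom features (m = 20 here, instead of n = 500
# kernel parameters for exact KLR).
Z = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X).cluster_centers_
Phi = nystrom_features(X, Z, gamma=1.0)
clf = LogisticRegression(max_iter=1000).fit(Phi, y)  # default solver is L-BFGS
print(clf.score(Phi, y))
```

The point of the approximation is the parameter count: the logistic model is trained on m Nyström features rather than n kernel columns, so memory and optimisation cost scale with the number of landmarks, not the number of observations.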
Source journal
Neurocomputing (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles per year: 1382
Review time: 70 days
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.