Predictive model for a product without history using LightGBM. Pricing model for a new product

Anastasiia Kriuchkova, Varvara Toloknova, Svitlana Drin
{"title":"Predictive model for a product without history using LightGBM. Pricing model for a new product","authors":"Anastasiia Kriuchkova, Varvara Toloknova, Svitlana Drin","doi":"10.18523/2617-7080620236-13","DOIUrl":null,"url":null,"abstract":"The article focuses on developing a predictive product pricing model using LightGBM. Also, the goal was to adapt the LightGBM method for regression problems and, especially, in the problems of forecasting the price of a product without history, that is, with a cold start.The article contains the necessary concepts to understand the working principles of the light gradient boosting machine, such as decision trees, boosting, random forests, gradient descent, GBM (Gradient Boosting Machine), GBDT (Gradient Boosting Decision Trees). The article provides detailed insights into the algorithms used for identifying split points, with a focus on the histogram-based approach.LightGBM enhances the gradient boosting algorithm by introducing an automated feature selection mechanism and giving special attention to boosting instances characterized by more substantial gradients. This can lead to significantly faster training and improved prediction performance. The Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) techniques used as enhancements to LightGBM are vividly described. The article presents the algorithms for both techniques and the complete LightGBM algorithm.This work contains an experimental result. To test the lightGBM, a real dataset of one Japanese C2C marketplace from the Kaggle site was taken. In the practical part, a performance comparison between LightGBM and XGBoost (Extreme Gradient Boosting Machine) was performed. As a result, only a slight increase in estimation performance (RMSE, MAE, R-squard) was found by applying LightGBM over XGBoost, however, there exists a notable contrast in the training procedure’s time efficiency. LightGBM exhibits an almost threefold increase in speed compared to XGBoost, making it a superior choice for handling extensive datasets.This article is dedicated to the development and implementation of machine learning models for product pricing using LightGBM. The incorporation of automatic feature selection, a focus on highgradient examples, and techniques like GOSS and EFB demonstrate the model’s versatility and efficiency. Such predictive models will help companies improve their pricing models for a new product. The speed of obtaining a forecast for each element of the database is extremely relevant at a time of rapid data accumulation.","PeriodicalId":404986,"journal":{"name":"Mohyla Mathematical Journal","volume":" 42","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mohyla Mathematical Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18523/2617-7080620236-13","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The article focuses on developing a predictive product pricing model using LightGBM. A further goal was to adapt the LightGBM method to regression problems, in particular to forecasting the price of a product without sales history, that is, under a cold start.

The article covers the concepts needed to understand how the light gradient boosting machine works: decision trees, boosting, random forests, gradient descent, GBM (Gradient Boosting Machine), and GBDT (Gradient Boosting Decision Trees). It also gives a detailed account of the algorithms used for finding split points, with a focus on the histogram-based approach.

LightGBM enhances the gradient boosting algorithm by introducing an automated feature selection mechanism and by giving special attention to instances with larger gradients, which can lead to significantly faster training and improved prediction performance. The Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) techniques used as enhancements in LightGBM are described in detail, and the article presents the algorithms for both techniques as well as the complete LightGBM algorithm.

The work also contains an experimental result. To test LightGBM, a real dataset of a Japanese C2C marketplace was taken from Kaggle. In the practical part, the performance of LightGBM was compared with that of XGBoost (Extreme Gradient Boosting). Applying LightGBM instead of XGBoost yielded only a slight improvement in estimation quality (RMSE, MAE, R-squared), but there is a notable difference in training time: LightGBM is almost three times faster than XGBoost, which makes it the better choice for large datasets.

The article is thus dedicated to the development and implementation of machine learning models for product pricing using LightGBM. Automatic feature selection, the focus on high-gradient examples, and techniques such as GOSS and EFB demonstrate the model's versatility and efficiency. Such predictive models will help companies improve pricing for new products, and the speed of obtaining a forecast for each element of the database is especially relevant at a time of rapid data accumulation.
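Since the abstract reports the LightGBM versus XGBoost comparison only at the level of aggregate metrics (RMSE, MAE, R-squared, training time), the following minimal Python sketch shows how such a comparison can be set up. The synthetic data, feature count, and hyperparameters are illustrative assumptions and are not taken from the article; `boosting_type="goss"` is how LightGBM's scikit-learn interface exposes Gradient-based One-Side Sampling, and histogram-based split finding is LightGBM's default.

```python
# Minimal sketch: compare LightGBM and XGBoost regressors on a tabular
# "pricing" dataset. The data and hyperparameters are illustrative only.
import time
import numpy as np
import lightgbm as lgb
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 20))                                      # stand-in for product features
y = 3 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.5, size=10_000)    # stand-in for price

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    # boosting_type="goss" turns on Gradient-based One-Side Sampling.
    "LightGBM": lgb.LGBMRegressor(boosting_type="goss", n_estimators=500, learning_rate=0.05),
    # tree_method="hist" selects XGBoost's histogram-based split finding.
    "XGBoost": xgb.XGBRegressor(n_estimators=500, learning_rate=0.05, tree_method="hist"),
}

for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    pred = model.predict(X_test)
    rmse = mean_squared_error(y_test, pred) ** 0.5
    print(f"{name}: RMSE={rmse:.3f}, MAE={mean_absolute_error(y_test, pred):.3f}, "
          f"R2={r2_score(y_test, pred):.3f}, train time={elapsed:.2f}s")
```

On a real marketplace dataset the same loop would simply read the feature matrix and price target from the prepared data instead of generating them, and report the same three metrics plus wall-clock training time for each model.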