Fusion of GBDT and neural network for click-through rate estimation

Bin Zhao, Wei Cao, Jiqun Zhang, Yilong Gao, Bin Li, Fengmei Chen
Journal: Journal of Intelligent & Fuzzy Systems
DOI: 10.3233/jifs-234713 (https://doi.org/10.3233/jifs-234713)
Publication date: 2024-03-16
Publication type: Journal Article
Citations: 0

Abstract

Current click-through rate (CTR) prediction methods ignore the varying impact of different input features on prediction accuracy and show low accuracy on large-scale data. To address this, this paper proposes GBIFM, a CTR prediction method that combines a Gradient Boosting Decision Tree (GBDT) with an Input-aware Factorization Machine (IFM). GBIFM uses GBDT for data processing, which flexibly handles various types of data without one-hot encoding of discrete features. An input-aware strategy refines the weight vector and embedding vector of each feature for each instance, adaptively learning the impact of each input vector on the feature representation. In addition, a fully connected network captures high-order features in a non-linear manner, improving the method's ability to represent and generalize over complex structured data. Experiments on the Criteo and Avazu datasets show that, compared with typical methods such as DeepFM, AFM, and IFM, GBIFM can increase AUC by 10% to 12% and decrease Logloss by 6% to 20%, effectively improving the accuracy of CTR prediction.
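The input-aware step in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes one active feature per field, replaces the factor-estimating network with a single hypothetical linear projection `proj`, and omits both the GBDT stage and the fully connected network. All function and parameter names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def input_aware_factors(embeddings, proj):
    """Return one re-weighting factor per field for this instance.

    Assumption: a single linear projection (one row of `proj` per field)
    stands in for the factor-estimating network. The softmax output is
    scaled by the number of fields, so the factors average to 1.
    """
    flat = [x for emb in embeddings for x in emb]
    logits = [sum(p * x for p, x in zip(row, flat)) for row in proj]
    n_fields = len(embeddings)
    return [n_fields * s for s in softmax(logits)]


def ifm_score(w0, weights, embeddings, proj):
    """Predicted click probability from an input-aware FM-style score."""
    factors = input_aware_factors(embeddings, proj)
    # Refine the per-feature weight and embedding for this instance.
    w = [m * wi for m, wi in zip(factors, weights)]
    v = [[m * x for x in emb] for m, emb in zip(factors, embeddings)]
    linear = sum(w)
    # Pairwise interactions via 0.5 * ((sum v)^2 - sum v^2), per dimension.
    pairwise = 0.0
    for d in range(len(v[0])):
        col = [vi[d] for vi in v]
        pairwise += 0.5 * (sum(col) ** 2 - sum(c * c for c in col))
    logit = w0 + linear + pairwise
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    emb = [[1.0, 0.0], [0.5, 0.5]]      # two fields, embedding size 2
    zero_proj = [[0.0] * 4, [0.0] * 4]  # uniform factors: plain FM
    print(ifm_score(0.0, [0.1, -0.2], emb, zero_proj))
```

With an all-zero `proj` every factor equals 1 and the score reduces to an ordinary factorization machine. A non-zero `proj` shifts weight toward the fields it scores higher for that particular instance, which is the adaptive re-weighting the abstract describes.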