Data Sets Modeling and Frequency Prediction via Machine Learning and Neural Network

Ziqi Zhang
Published in: 2021 IEEE International Conference on Emergency Science and Information Technology (ICESIT)
Publication date: 2021-11-22
DOI: 10.1109/ICESIT53460.2021.9696532 (https://doi.org/10.1109/ICESIT53460.2021.9696532)
Citations: 1

Abstract

In recent years, generalized linear models (GLMs) have been widely used in auto insurance pricing. Some studies report that machine learning outperforms GLMs in certain respects, but each of those results is based on a single data set. To compare GLMs and machine learning methods more comprehensively on the problem of predicting auto insurance claim frequency, a comparative test was carried out on 7 auto insurance data sets, covering deep learning, random forest, support vector machine, XGBoost, and other machine learning methods. On the same training set, different GLMs were fitted to predict claim frequency, and the best one was selected as the model with the minimum Akaike information criterion (AIC); the best machine learning hyperparameters and models were obtained through cross-validation tuning. The results show that XGBoost predicts consistently better than the GLM on all data sets; for some data sets with more independent variables and strong correlation between variables, neural networks, deep learning, and random forests predict better than GLMs.
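The AIC-based GLM selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes a log-link Poisson GLM for claim counts (the abstract does not name the distribution), fits it with a hand-written IRLS loop in NumPy, and uses synthetic data. The names `fit_poisson_glm`, `aic`, `x1`, `x2`, `x3`, and the candidate model list are all hypothetical.

```python
import numpy as np
from math import lgamma


def fit_poisson_glm(X, y, n_iter=50, tol=1e-10):
    """Fit a log-link Poisson GLM by iteratively reweighted least squares.

    X must contain an intercept column as column 0.
    Returns (beta, log_likelihood).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu  # working response
        XtW = X.T * mu           # Poisson working weights equal mu
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    mu = np.exp(X @ beta)
    # Full Poisson log-likelihood, including the log(y!) term.
    loglik = float(np.sum(y * np.log(mu) - mu) - sum(lgamma(v + 1.0) for v in y))
    return beta, loglik


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2.0 * n_params - 2.0 * loglik


# Synthetic claim counts (assumption: illustrative data, not from the paper).
rng = np.random.default_rng(0)
n = 5000
x1, x2, x3 = rng.normal(size=(3, n))  # x3 is pure noise
y = rng.poisson(np.exp(-2.0 + 0.5 * x1 - 0.3 * x2))

ones = np.ones(n)
candidates = {
    "intercept": np.column_stack([ones]),
    "x1": np.column_stack([ones, x1]),
    "x1+x2": np.column_stack([ones, x1, x2]),
    "x1+x2+x3": np.column_stack([ones, x1, x2, x3]),
}

# Fit every candidate on the same training data and record its AIC.
results = {}
for name, X in candidates.items():
    _, loglik = fit_poisson_glm(X, y)
    results[name] = aic(loglik, X.shape[1])

best = min(results, key=results.get)
for name, value in results.items():
    print(f"{name:10s} AIC = {value:.1f}")
print("selected:", best)
```

In practice the candidate GLMs would differ in their covariates, interactions, or distribution family, and the selected GLM would then be compared with the cross-validated machine learning models on held-out data.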