Title: Predictive analytics of insurance claims using multivariate decision trees
Authors: Zhiyu Quan, Emiliano A. Valdez
Journal: Dependence Modeling, vol. 6, no. 1, pp. 377-407
Publication date: 2018-07-18
Publication type: Journal Article
DOI: 10.1515/demo-2018-0022 (https://doi.org/10.1515/demo-2018-0022)
Impact factor: 0.8; JCR: Q4 (Statistics & Probability)
Citations: 29
Abstract
Because of its many advantages, the use of decision trees has become an increasingly popular alternative predictive tool for building classification and regression models. Its origins date back about five decades, and the algorithm can be broadly described as repeatedly partitioning the regions of the explanatory variables, thereby creating a tree-based model for predicting the response. Innovations to the original methods, such as random forests and gradient boosting, have further improved the capabilities of decision trees as a predictive model. In addition, the extension of decision trees to multivariate response variables has started to develop, and the purpose of this paper is to apply multivariate tree models to insurance claims data with correlated responses. This extension inherits several advantages of univariate decision tree models, such as being distribution-free, the ability to rank essential explanatory variables, and high predictive accuracy, to name a few. To illustrate the approach, we analyze a dataset drawn from the Wisconsin Local Government Property Insurance Fund (LGPIF), which offers multi-line insurance coverage of property, motor vehicles, and contractors' equipment. With multivariate tree models, we are able to capture the inherent relationship among the response variables, and we find that the marginal predictive model based on multivariate trees improves prediction accuracy over that based on univariate trees alone.
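The recursive partitioning described in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not use the LGPIF data: the function names (`sse`, `best_split`, `build_tree`, `predict`), the toy data, and the choice of split criterion (squared error summed over all response columns, one common option for multivariate regression trees) are assumptions made for illustration only.

```python
def sse(rows):
    """Squared error around the column means, summed over all responses."""
    if not rows:
        return 0.0
    total = 0.0
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / len(col)
        total += sum((v - mean) ** 2 for v in col)
    return total


def best_split(X, Y):
    """Return (feature, threshold, loss) minimising the joint SSE.

    Returns (None, None, sse(Y)) when no split improves on the parent.
    """
    best = (None, None, sse(Y))
    for f in range(len(X[0])):
        values = sorted(set(x[f] for x in X))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in zip(X, Y) if x[f] <= t]
            right = [y for x, y in zip(X, Y) if x[f] > t]
            loss = sse(left) + sse(right)
            if loss < best[2]:
                best = (f, t, loss)
    return best


def build_tree(X, Y, max_depth=2):
    """Grow a tree; internal nodes are tuples, leaves are mean vectors (lists)."""
    f, t, _ = best_split(X, Y)
    if max_depth == 0 or f is None:
        return [sum(y[j] for y in Y) / len(Y) for j in range(len(Y[0]))]
    left = [(x, y) for x, y in zip(X, Y) if x[f] <= t]
    right = [(x, y) for x, y in zip(X, Y) if x[f] > t]
    return (
        f,
        t,
        build_tree([x for x, _ in left], [y for _, y in left], max_depth - 1),
        build_tree([x for x, _ in right], [y for _, y in right], max_depth - 1),
    )


def predict(tree, x):
    """Walk from the root to a leaf and return its vector of response means."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


# Invented toy data: two responses that jump together when feature 0 reaches 10.
X = [[i, (i * 7) % 10] for i in range(20)]
Y = [[1.0, 2.0] if i < 10 else [5.0, 9.0] for i in range(20)]

tree = build_tree(X, Y)
print(predict(tree, [3, 0]), predict(tree, [15, 0]))
```

Because a single split is chosen for all responses at once, each leaf predicts the whole response vector, which is how a multivariate tree can reflect dependence among the responses; fitting one univariate tree per response would choose splits separately for each.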
About the journal:
The journal Dependence Modeling aims at providing a medium for exchanging results and ideas in the area of multivariate dependence modeling. It is an open access, fully peer-reviewed journal providing readers with free, instant, and permanent access to all content worldwide. Dependence Modeling is listed by Web of Science (Emerging Sources Citation Index), Scopus, MathSciNet and Zentralblatt Math.

The journal presents different types of articles:
- "Research Articles" on fundamental theoretical aspects, as well as on significant applications in science, engineering, economics, finance, insurance and other fields.
- "Review Articles", which present the existing literature on a specific topic from new perspectives.
- "Interview Articles", limited to two papers per year, covering interviews with milestone personalities in the field of dependence modeling.

The journal topics include (but are not limited to):
- Copula methods
- Multivariate distributions
- Estimation and goodness-of-fit tests
- Measures of association
- Quantitative risk management
- Risk measures and stochastic orders
- Time series
- Environmental sciences
- Computational methods and software
- Extreme-value theory
- Limit laws
- Mass Transportations