Robin Van Oirbeek, Félix Vandervorst, Thomas Bury, Gireg Willame, Christopher Grumiau, Tim Verdonck
Non-Differentiable Loss Function Optimization and Interaction Effect Discovery in Insurance Pricing Using the Genetic Algorithm
Risks (journal article), published 2024-05-14. DOI: 10.3390/risks12050079, https://doi.org/10.3390/risks12050079
Abstract
Insurance pricing is the process of determining the premiums that policyholders pay in exchange for insurance coverage. To estimate premiums, actuaries use statistical methods that assess factors such as the probability of certain events (for example, accidents or damages); Generalized Linear Models (GLMs) are the industry standard. Traditional GLM approaches face limitations due to non-differentiable loss functions and expansive variable spaces that include both main and interaction terms. In this study, we address the challenge of selecting relevant variables for GLMs used in non-life insurance pricing, for both frequency and severity analyses, amid an increasing volume of data and variables. We propose a novel application of the Genetic Algorithm (GA) to efficiently identify pertinent main and interaction effects in GLMs, even in scenarios with a high variable count and diverse loss functions. Our approach uniquely aligns GLM predictions with those of black box machine learning models, enhancing their interpretability and reliability. Using a publicly available non-life motor data set, we demonstrate the GA’s effectiveness by comparing its selected GLM with a Gradient Boosted Machine (GBM) model. The results show strong consistency between the main and interaction terms identified by the GA for the GLM and those revealed in the GBM analysis, highlighting the potential of our method to refine and improve pricing models in the insurance sector.
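To make the idea of GA-based term selection concrete, the following is a minimal sketch and not the authors' implementation. It encodes each candidate model as a bitmask over main effects and pairwise interactions and evolves the population with tournament selection, one-point crossover, bit-flip mutation, and elitism. All names here (`candidate_terms`, `genetic_search`, `toy_fitness`, the rating factors, and the "relevant" term set) are hypothetical. The fitness function is a toy stand-in: in a real pricing setting it would fit a GLM on the selected terms and return the negative of whatever loss is being optimized, which is why the search only ever calls it as a black box and never needs a gradient.

```python
import random
from itertools import combinations


def candidate_terms(variables):
    """Main effects plus all pairwise interactions, as tuples of names."""
    mains = [(v,) for v in variables]
    pairs = list(combinations(variables, 2))
    return mains + pairs


def genetic_search(n_terms, fitness, pop_size=40, generations=60,
                   mutation_rate=0.05, seed=0):
    """Maximise `fitness` over bitmasks of length n_terms (1 = term included).

    `fitness` is treated as a black box, so it may be non-differentiable.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_terms)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_terms)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Hypothetical rating factors and a hypothetical set of truly relevant terms.
variables = ["age", "power", "region", "fuel"]
terms = candidate_terms(variables)
relevant = {("age",), ("power",), ("age", "power")}


def toy_fitness(mask):
    """Toy stand-in for 'fit a GLM on the chosen terms and score it'.

    Rewards relevant terms and penalises irrelevant ones.
    """
    chosen = {t for t, bit in zip(terms, mask) if bit}
    return len(chosen & relevant) - 0.5 * len(chosen - relevant)


best_mask = genetic_search(len(terms), toy_fitness)
selected = [t for t, bit in zip(terms, best_mask) if bit]
```

Because the search only compares fitness values, replacing `toy_fitness` with a function that fits a GLM and scores it against, say, the predictions of a black box model would not require changing `genetic_search`. The bitmask grows quadratically with the number of variables once pairwise interactions are included, which is the kind of search space where a population-based heuristic may be preferable to exhaustive enumeration.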