A Generalized Linear Model and Machine Learning Approach for Predicting the Frequency and Severity of Cargo Insurance in Thailand’s Border Trade Context
Praiya Panjee, Sataporn Amornsawadwatana
Risks, published 2024-01-30. DOI: 10.3390/risks12020025 (https://doi.org/10.3390/risks12020025)
Abstract
This study compares generalized linear models (GLMs) with advanced machine learning techniques for predicting claim frequency and severity in cross-border cargo insurance, with the aim of identifying the better-performing approach. Predictive performance is evaluated using mean absolute error (MAE) and root mean squared error (RMSE). For frequency prediction, extreme gradient boosting (XGBoost) achieves the lowest MAE, indicating a smaller average error than gradient boosting machines (GBMs) and the Poisson GLM. However, XGBoost also shows a higher RMSE than both, which points to a wider spread of errors and some larger individual errors, since RMSE penalizes large errors more heavily than MAE does. Conversely, the Poisson GLM achieves the best RMSE, indicating more tightly clustered errors of smaller magnitude, despite a slightly higher MAE. For severity prediction, XGBoost again achieves the lowest MAE but a higher RMSE than the Gamma GLM, indicating wider error dispersion. The Gamma GLM, in contrast, achieves the lowest RMSE, reflecting more tightly clustered errors of smaller magnitude despite a higher MAE. In conclusion, XGBoost performs best on MAE for both frequency and severity prediction. However, the Gamma GLM offers a balance between accuracy and error magnitude: it outperforms XGBoost and GBMs on RMSE, at the cost of a slightly higher MAE.
These findings can help insurance companies improve risk assessment, set suitable premiums, manage reserves, and forecast claim occurrences more accurately, supporting competitive pricing for clients while maintaining profitability. For cross-border trade entities, such as trucking companies and cargo owners, the results support better risk management and potential cost savings, because insurers can set more reasonable premiums when their claim predictions are accurate.
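The pattern reported above, where one model has the lowest MAE but a higher RMSE, can look contradictory at first. The following minimal sketch illustrates how it can arise. The claim counts and the two sets of predictions are invented for illustration only and are not data or results from the study; `model_a` and `model_b` are hypothetical names, not the paper's fitted models.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average size of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: squaring weights large errors more heavily."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical claim counts for eight policies (illustrative, not study data).
actual = [0, 0, 1, 0, 2, 0, 0, 5]

# Hypothetical model A: exact on seven policies, one large miss (error of 4).
model_a = [0, 0, 1, 0, 2, 0, 0, 1]

# Hypothetical model B: a moderate error of about 0.6 on every policy.
model_b = [0.6, 0.6, 0.4, 0.6, 1.4, 0.6, 0.6, 4.4]

for name, predicted in (("A", model_a), ("B", model_b)):
    print(f"model {name}: MAE={mae(actual, predicted):.3f} "
          f"RMSE={rmse(actual, predicted):.3f}")
# model A: MAE=0.500 RMSE=1.414
# model B: MAE=0.600 RMSE=0.600
```

In this toy case model A wins on MAE while model B wins on RMSE, because a single large miss dominates the squared-error average. Which metric matters more depends on how costly large individual mispredictions are for pricing and reserving.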