Insurance Fraud Detection Based on XGBoost
Academic Journal of Computing & Information Science. Publication date: 2023-01-01. DOI: 10.25236/ajcis.2023.060808 (https://doi.org/10.25236/ajcis.2023.060808)

This study predicts customer car insurance claims using Gradient Boosting Decision Tree (GBDT) and XGBoost models. The workflow covers data exploration, feature engineering, model evaluation, and parameter tuning. The dataset was first explored by variable type and missing values, then processed with mean encoding and outlier removal. Date features were also transformed into more meaningful derived features. The two models were trained and compared on their AUC (Area Under the Curve) values. Both showed good predictive power, with GBDT slightly outperforming XGBoost. The results offer useful guidance for predicting insurance claims, both for further research and for practical applications.
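
The abstract names mean encoding, derived date features, and AUC without showing how they are computed. The following is a minimal standard-library sketch of those three steps, not the authors' implementation. The function names, the `smoothing` parameter, and the toy `regions`/`claimed` data are all hypothetical and chosen only for illustration. AUC is computed here as the area under the ROC curve, which is the usual meaning of the term and is assumed to be what the study reports.

```python
from collections import defaultdict
from datetime import date


def mean_encode(categories, targets, smoothing=1.0):
    """Map each category to a smoothed mean of the binary target.

    `smoothing` is a hypothetical knob: it shrinks rare categories
    toward the global target mean so single-row categories are less extreme.
    """
    prior = sum(targets) / len(targets)
    sums, counts = defaultdict(float), defaultdict(int)
    for category, y in zip(categories, targets):
        sums[category] += y
        counts[category] += 1
    return {
        c: (sums[c] + smoothing * prior) / (counts[c] + smoothing)
        for c in counts
    }


def days_between(start, end):
    """Turn two raw dates into one numeric feature: elapsed days."""
    return (end - start).days


def auc(labels, scores):
    """ROC AUC as the probability that a random positive is scored
    above a random negative, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration, not from the study).
regions = ["north", "north", "south", "south", "south", "east"]
claimed = [1, 0, 0, 0, 1, 1]

encoding = mean_encode(regions, claimed, smoothing=1.0)
policy_age = days_between(date(2022, 1, 1), date(2022, 3, 1))
score = auc([1, 0, 1, 0], [0.9, 0.4, 0.35, 0.2])
```

In practice the encoding would be fitted on training folds only, since computing it on the same rows the model is evaluated on leaks the target. The pairwise AUC above is quadratic in the number of rows, so a rank-based implementation is preferable on large datasets.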