I Nyoman Mahayasa Adiputra, Paweena Wanchai, Pei-Chun Lin
PeerJ Computer Science, vol. 11, e2949. Published 2025-06-19. DOI: 10.7717/peerj-cs.2949 (https://doi.org/10.7717/peerj-cs.2949). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12193428/pdf/
Optimized customer churn prediction using tabular generative adversarial network (GAN)-based hybrid sampling method and cost-sensitive learning.
Background: Imbalanced and overlapped data in customer churn prediction significantly impact classification results. Various sampling and hybrid sampling methods have demonstrated effectiveness in addressing these issues. However, these methods have not performed well with classical machine learning algorithms.
Methods: To optimize the performance of classical machine learning on customer churn prediction tasks, this study introduces an extension framework called CostLearnGAN, which combines a tabular generative adversarial network (GAN)-based hybrid sampling method with cost-sensitive learning. By taking a cost-sensitive learning perspective, this research aims to enhance the performance of several classical machine learning algorithms in customer churn prediction tasks. According to the experimental results, classical machine learning algorithms exhibit shorter execution times, making them suitable for predicting churn in large customer bases.
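The abstract does not state how CostLearnGAN applies misclassification costs, so the following is only a minimal sketch of one common form of cost-sensitive learning: shifting the decision threshold so that expected cost is minimised. The function names, cost values, and churn probabilities are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only; not the authors' method.
# Assumption: correct predictions cost nothing, a false positive costs
# cost_fp, and a false negative (a missed churner) costs cost_fn.

def cost_sensitive_threshold(cost_fp: float, cost_fn: float) -> float:
    """Return the churn-probability threshold that minimises expected cost.

    Predicting "churn" costs (1 - p) * cost_fp in expectation, while
    predicting "stay" costs p * cost_fn, so "churn" is the cheaper choice
    whenever p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict_churn(probabilities, cost_fp, cost_fn):
    """Label each customer 1 (churn) or 0 (stay) using the cost-based threshold."""
    threshold = cost_sensitive_threshold(cost_fp, cost_fn)
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical churn probabilities produced by some classifier.
churn_probabilities = [0.05, 0.15, 0.30, 0.60]

# Cost-blind labels use the usual 0.5 threshold.
default_labels = [1 if p >= 0.5 else 0 for p in churn_probabilities]

# Assumed costs: missing a churner is four times as costly as a false alarm.
cost_labels = predict_churn(churn_probabilities, cost_fp=1.0, cost_fn=4.0)

print("default:", default_labels)
print("cost-sensitive:", cost_labels)
```

With these assumed costs the threshold drops from 0.5 to 0.2, so the customer with probability 0.30 is flagged as a likely churner, which is the kind of shift toward the minority class that cost-sensitive learning is meant to produce.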
Results: This study conducted an experiment with six comparative sampling methods, six datasets, and three machine learning algorithms. The results show that CostLearnGAN achieved satisfactory results across all evaluation metrics, with an average mean rank score of 1.44. Additionally, this study provided a robustness measurement for algorithms, demonstrating that CostLearnGAN outperforms other sampling methods in improving the performance of classical machine learning models, with an average robustness value of 5.68.
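The abstract reports a mean rank score without defining it. A common way to compute such a score, assumed here, is to rank the competing methods within each dataset (1 = best, ties sharing the average rank) and then average each method's rank across datasets. The method names, dataset names, and scores below are hypothetical placeholders, not results from the paper.

```python
# Illustrative sketch only; the paper's exact ranking procedure is not given
# in the abstract. Assumption: a higher metric value is better and every
# method is evaluated on every dataset.

def rank_scores(scores):
    """Rank methods within one dataset; tied scores share the average rank."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j to cover every method tied with the one at position i.
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = average_rank
        i = j + 1
    return ranks


def mean_ranks(results):
    """Average each method's per-dataset rank over all datasets."""
    totals = {}
    for scores in results.values():
        for method, rank in rank_scores(scores).items():
            totals[method] = totals.get(method, 0.0) + rank
    return {method: total / len(results) for method, total in totals.items()}


# Hypothetical metric values (for example F1), invented for illustration.
results = {
    "dataset_a": {"method_x": 0.80, "method_y": 0.75, "method_z": 0.70},
    "dataset_b": {"method_x": 0.65, "method_y": 0.70, "method_z": 0.60},
    "dataset_c": {"method_x": 0.90, "method_y": 0.85, "method_z": 0.85},
}

for method, rank in sorted(mean_ranks(results).items(), key=lambda item: item[1]):
    print(f"{method}: mean rank {rank:.2f}")
```

Under this convention a lower mean rank is better, and a value close to 1 means the method was the top performer on most datasets.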
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.