Uplift modeling with quasi-loss-functions
Jinping Hu, Evert de Haan, Bernd Skiera
Data Mining and Knowledge Discovery, published 2024-06-04
DOI: https://doi.org/10.1007/s10618-024-01042-x
Citation count: 0
Abstract
Uplift modeling, also referred to as heterogeneous treatment effect estimation, is a machine learning technique used in marketing to estimate the incremental impact of a treatment on the response of each customer. Uplift models face a fundamental challenge in causal inference: the variable of interest (i.e., the uplift itself) remains unobservable, because each customer is observed either with or without the treatment, never both. As a result, popular uplift models (such as meta-learners and uplift trees) do not incorporate loss functions for uplifts in their algorithms. This article addresses that gap by proposing uplift models with quasi-loss functions (UpliftQL models), which separately use four specially designed quasi-loss functions for uplift estimation within their algorithms. Using simulated data, our analysis reveals that, on average, 55% (34%) of the top five models from a set of 14 are UpliftQL models for binary (continuous) outcomes. Further empirical data analysis shows that over 60% of the top-performing models are consistently UpliftQL models.
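To make the idea of estimating an unobservable uplift concrete, the following is a minimal sketch of a two-model meta-learner, one common approach in the family the abstract mentions. It is not the UpliftQL method, whose quasi-loss functions are not specified here. The simulated data, the linear outcome models, and all function names are illustrative assumptions.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate a randomized experiment (hypothetical data-generating process)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)      # one customer feature
    t = rng.integers(0, 2, size=n)          # random treatment assignment (0 or 1)
    true_uplift = 0.5 + 1.0 * x             # assumed ground truth, known only in simulation
    noise = rng.normal(0.0, 0.1, size=n)
    y = 1.0 + 2.0 * x + t * true_uplift + noise  # continuous outcome
    return x, t, y, true_uplift


def fit_linear(x, y):
    """Ordinary least squares fit of y on an intercept and x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef, x):
    return coef[0] + coef[1] * x


def two_model_uplift(x, t, y, x_new):
    """Fit separate outcome models for treated and control customers,
    then estimate uplift as the difference of their predictions."""
    coef_treated = fit_linear(x[t == 1], y[t == 1])
    coef_control = fit_linear(x[t == 0], y[t == 0])
    return predict_linear(coef_treated, x_new) - predict_linear(coef_control, x_new)


x, t, y, true_uplift = simulate()
estimated = two_model_uplift(x, t, y, x)

# The comparison below is possible only because the data are simulated;
# with real customers the true uplift is never observed.
mean_abs_error = float(np.mean(np.abs(estimated - true_uplift)))
print(f"mean absolute error of estimated uplift: {mean_abs_error:.4f}")
```

Each outcome model in this sketch minimizes a loss on the observed response, not on the uplift itself, which is the gap the abstract describes.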
About the journal:
Advances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high performance computing.