Title: Efficient and proper generalised linear models with power link functions
Authors: Vali Asimit, Alexandru Badescu, Ziwei Chen, Feng Zhou
Journal: Insurance Mathematics & Economics, volume 122, pages 91-118
Publication date: 2025-02-26
Publication type: Journal Article
DOI: 10.1016/j.insmatheco.2025.02.005
URL: https://www.sciencedirect.com/science/article/pii/S0167668725000368
Impact factor: 2.2 (JCR Q2, Economics)
Citation count: 0
Abstract
The generalised linear model is a flexible predictive model for observational data that is widely used in practice because it extends linear regression models to non-Gaussian data. In this paper, we introduce the concept of a properly defined generalised linear model by requiring that the conditional mean of the response variable be properly mapped through the chosen link function and that the log-likelihood function be concave. We provide a comprehensive classification of proper generalised linear models for the Tweedie family and its popular subclasses under different link function specifications. Our main theoretical findings show that most Tweedie generalised linear models are not proper for canonical and log link functions, and identify a rich class of proper Tweedie generalised linear models with power link functions. We provide a novel interpretability methodology for power link functions that is mathematically sound and very simple, which could encourage the adoption of a link function that has seen little use in practice because of its lack of interpretability. Using self-concordant log-likelihoods and linearisation techniques, we provide novel algorithms for estimating several special cases of proper and not proper Tweedie generalised linear models with power link functions. The effectiveness of our methods is assessed through an extensive numerical comparison of our estimates with those obtained using three built-in packages, MATLAB fitglm, R glm2 and Python sm.GLM, all of which are based on the standard Iteratively Reweighted Least Squares method. Overall, we find that our algorithms consistently outperform these benchmarks in terms of both accuracy and efficiency, with the largest improvements documented in high-dimensional settings. These conclusions hold for both simulated and real data, which shows that our optimisation-based GLM implementation is a good alternative to the standard Iteratively Reweighted Least Squares implementations available in well-known software.
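The concavity requirement mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's algorithm or its formal definition of properness. It only evaluates the curvature, with respect to the linear predictor, of the mean-dependent part of a single-observation Tweedie log-likelihood, using the standard Tweedie kernel with unit dispersion. The function names, the choice of the inverse Gaussian case (variance power 3), the evaluation point and the power-link exponent of -1 are all assumptions made for illustration.

```python
import math

def tweedie_kernel(mu, y, p):
    """Mean-dependent part of the Tweedie log-likelihood for one observation
    (unit dispersion), valid for variance power p not in {1, 2}."""
    return y * mu ** (1.0 - p) / (1.0 - p) - mu ** (2.0 - p) / (2.0 - p)

def log_inverse(eta):
    """Inverse of the log link eta = log(mu)."""
    return math.exp(eta)

def power_inverse(gamma):
    """Inverse of the power link eta = mu**gamma (eta > 0 assumed)."""
    return lambda eta: eta ** (1.0 / gamma)

def curvature(eta, y, p, inverse_link, h=1e-4):
    """Central finite-difference second derivative of the kernel w.r.t. eta."""
    f = lambda e: tweedie_kernel(inverse_link(e), y, p)
    return (f(eta + h) - 2.0 * f(eta) + f(eta - h)) / (h * h)

# Illustrative (hypothetical) setting: inverse Gaussian case, p = 3,
# one observation y = 1, curvature evaluated at mean mu = 4.
p, y, mu = 3.0, 1.0, 4.0
curv_log = curvature(math.log(mu), y, p, log_inverse)
curv_pow = curvature(mu ** -1.0, y, p, power_inverse(-1.0))
print(f"log link:                {curv_log:+.4f}")
print(f"power link (gamma = -1): {curv_pow:+.4f}")
```

At this point the curvature is positive under the log link, so the log-likelihood cannot be concave there, while under the power link with exponent -1 the kernel is quadratic in the linear predictor and its curvature is negative. A single-point check like this only shows that concavity can fail. It does not establish properness, which in the paper also involves how the link maps the conditional mean.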
About the journal:
Insurance: Mathematics and Economics publishes leading research spanning all fields of actuarial science. It appears six times per year and is the largest journal in actuarial science research around the world.
Insurance: Mathematics and Economics is an international academic journal that aims to strengthen communication between individuals and groups who develop and apply research results in actuarial science. The journal feels a particular obligation to facilitate closer cooperation between those who conduct research in insurance mathematics and quantitative insurance economics and practising actuaries who are interested in implementing the results. To this end, Insurance: Mathematics and Economics publishes high-quality articles of broad international interest, concerned with either the theory of insurance mathematics and quantitative insurance economics or its inventive application, including empirical or experimental results. Articles that combine several of these aspects are particularly welcome.