Suman Rakshit, Greg McSwiggan, Gopalan Nair, Adrian Baddeley
{"title":"Variable selection using penalised likelihoods for point patterns on a linear network","authors":"Suman Rakshit, Greg McSwiggan, Gopalan Nair, Adrian Baddeley","doi":"10.1111/anzs.12341","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Motivated by the analysis of a comprehensive database of road traffic accidents, we investigate methods of variable selection for spatial point process models on a linear network. The original data may include explanatory spatial covariates, such as road curvature, and ‘mark’ variables attributed to individual accidents, such as accident severity. The treatment of mark variables is new. Variable selection is applied to the canonical covariates, which may include spatial covariate effects, mark effects and mark-covariate interactions. We approximate the likelihood of the point process model by that of a generalised linear model, in such a way that spatial covariates and marks are both associated with canonical covariates. We impose a convex penalty on the log likelihood, principally the elastic-net penalty, and maximise the penalised loglikelihood by cyclic coordinate ascent. A simulation study compares the performances of the lasso, ridge regression and elastic-net methods of variable selection on their ability to select variables correctly, and on their bias and standard error. Standard techniques for selecting the regularisation parameter <i>γ</i> often yielded unsatisfactory results. We propose two new rules for selecting <i>γ</i> which are designed to have better performance. The methods are tested on a small dataset on crimes in a Chicago neighbourhood, and applied to a large dataset of road traffic accidents in Western Australia.</p>\n </div>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"100","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/anzs.12341","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
Motivated by the analysis of a comprehensive database of road traffic accidents, we investigate methods of variable selection for spatial point process models on a linear network. The original data may include explanatory spatial covariates, such as road curvature, and ‘mark’ variables attributed to individual accidents, such as accident severity. The treatment of mark variables is new. Variable selection is applied to the canonical covariates, which may include spatial covariate effects, mark effects and mark-covariate interactions. We approximate the likelihood of the point process model by that of a generalised linear model, in such a way that spatial covariates and marks are both associated with canonical covariates. We impose a convex penalty on the log likelihood, principally the elastic-net penalty, and maximise the penalised loglikelihood by cyclic coordinate ascent. A simulation study compares the performances of the lasso, ridge regression and elastic-net methods of variable selection on their ability to select variables correctly, and on their bias and standard error. Standard techniques for selecting the regularisation parameter γ often yielded unsatisfactory results. We propose two new rules for selecting γ which are designed to have better performance. The methods are tested on a small dataset on crimes in a Chicago neighbourhood, and applied to a large dataset of road traffic accidents in Western Australia.