On the effect of confounding in linear regression models: an approach based on the theory of quadratic forms
Martina Narcisi, Fedele Greco, Carlo Trivisano
Environmental and Ecological Statistics, published 2024-03-22
DOI: https://doi.org/10.1007/s10651-024-00604-y
Abstract
In the last two decades, significant research efforts have been dedicated to addressing the issue of spatial confounding in linear regression models. Confounding occurs when the relationship between the covariate and the response variable is influenced by an unmeasured confounder associated with both. This results in biased estimators for the regression coefficients, reduced efficiency, and misleading interpretations. This article aims to understand how confounding relates to the parameters of the data generating process. The sampling properties of the regression coefficient estimator are derived as ratios of dependent quadratic forms in Gaussian random variables: this allows us to obtain exact expressions for the marginal bias and variance of the estimator, which were not obtained in previous studies. Moreover, we provide an approximate measure of the marginal bias that gives insight into the main determinants of bias. Applications in the framework of geostatistical and areal data modeling are presented. Particular attention is devoted to the difference between smoothness and variability of the random vectors involved in the data generating process. Results indicate that the marginal covariance between the covariate and the confounder, along with the marginal variability of the covariate, plays the most relevant role in determining the magnitude of confounding, as measured by the bias.
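As an illustration of the ratio-of-quadratic-forms view, the sketch below is a minimal Python simulation, not the authors' derivation. It assumes a model y = b_x x + b_z z + e with an intercept, where z is the unmeasured confounder and (x, z) is jointly Gaussian. When z is omitted, the OLS slope has conditional bias b_z x'Mz / x'Mx, with M the centering matrix. The function names, the exponential correlation on a transect, and all parameter values are assumptions chosen for illustration. The first-order approximation tr(M S_xz) / tr(M S_x) is a generic ratio-of-expectations device and may differ from the approximate measure proposed in the article.

```python
import numpy as np


def simulate_bias(Sigma_x, Sigma_z, Sigma_xz, beta_z=1.0, n_rep=2000, seed=0):
    """Monte Carlo estimate of the marginal bias of the OLS slope of x
    when the confounder z is omitted (hypothetical helper).

    The independent error term is left out: it is uncorrelated with x,
    so it does not contribute to the bias.
    """
    rng = np.random.default_rng(seed)
    n = Sigma_x.shape[0]
    # Joint covariance of the stacked Gaussian vector (x, z)
    Sigma = np.block([[Sigma_x, Sigma_xz], [Sigma_xz.T, Sigma_z]])
    # Small jitter keeps the Cholesky factorisation numerically stable
    L = np.linalg.cholesky(Sigma + 1e-10 * np.eye(2 * n))
    W = rng.standard_normal((n_rep, 2 * n)) @ L.T
    X, Z = W[:, :n], W[:, n:]
    # Centering x is equivalent to applying M = I - 11'/n
    Xc = X - X.mean(axis=1, keepdims=True)
    num = (Xc * Z).sum(axis=1)   # x' M z for each replicate
    den = (Xc * Xc).sum(axis=1)  # x' M x for each replicate
    return beta_z * np.mean(num / den)


def approx_bias(Sigma_x, Sigma_xz, beta_z=1.0):
    """First-order approximation: ratio of the expected quadratic forms."""
    n = Sigma_x.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    return beta_z * np.trace(M @ Sigma_xz) / np.trace(M @ Sigma_x)


# Assumed setting: 40 sites on a line, exponential correlation with range 5
coords = np.arange(40.0)
R = np.exp(-np.abs(coords[:, None] - coords[None, :]) / 5.0)
sigma_x, sigma_z, rho = 1.0, 1.0, 0.5  # illustrative values only
Sigma_x = sigma_x**2 * R
Sigma_z = sigma_z**2 * R
Sigma_xz = rho * sigma_x * sigma_z * R

print("Monte Carlo marginal bias:", simulate_bias(Sigma_x, Sigma_z, Sigma_xz))
print("First-order approximation:", approx_bias(Sigma_x, Sigma_xz))
```

In this sketch, increasing the cross-covariance `Sigma_xz` or shrinking the scale of `Sigma_x` raises both quantities, which matches the qualitative statement that the covariance between covariate and confounder and the variability of the covariate drive the bias.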
About the journal:
Environmental and Ecological Statistics publishes papers on practical applications of statistics and related quantitative methods to environmental science addressing contemporary issues.
Emphasis is on applied mathematical statistics, statistical methodology, and data interpretation and improvement for future use, with a view to advancing statistics for the environment, ecology and environmental health, and to advancing environmental theory and practice using valid statistics.
Besides clarity of exposition, the single most important criterion for publication is the appropriateness of the statistical method to the particular environmental problem. The Journal covers all aspects of the collection, analysis, presentation and interpretation of environmental data for research, policy and regulation. The Journal is cross-disciplinary within the context of contemporary environmental issues and the associated statistical tools, concepts and methods. The Journal broadly covers theory and methods, case studies and applications, environmental change and statistical ecology, environmental health statistics and stochastics, and related areas. Special features include invited discussion papers; research communications; technical notes and consultation corner; mini-reviews; letters to the Editor; news, views and announcements; hardware and software reviews; and data management.