Quantifying the relationship between observed variables with censored values using Bayesian error-in-variables regression

Impact factor 8.1 · Zone 2 (Environmental Science and Ecology) · JCR Q1 (Environmental Sciences)
Peter Vermeiren, Sandrine Charles, Cynthia C. Muñoz
Chemosphere, Volume 376, Article 144269. Journal article, published 2025-03-14.
DOI: 10.1016/j.chemosphere.2025.144269
https://www.sciencedirect.com/science/article/pii/S0045653525002115
Citation count: 0

Abstract

We aimed to address two common challenges for scientists working with observational data: “how to quantify the relationship between two observed (or measured) variables” and “how to account for censored values” (i.e., observations or measures whose value is only known to fall within a range). Quantifying the relationship between observed variables, and predicting one variable from the other (and vice versa), violates the assumption of standard regression that there is an independent, explanatory variable observed with no (or limited) uncertainty. To overcome this challenge, we developed and tested a Bayesian error-in-variables (EIV) regression model which accounts for uncertainty in variables orthogonally. Moreover, parameter estimation using Bayesian inference allowed the full parameter uncertainty to be propagated into probabilistic model predictions suitable for decision making. Alternative model formulations were applied to a dataset containing measured concentrations of organic pollutants in mothers and their eggs from the freshwater turtle Malaclemys terrapin and validated against an independent dataset of the turtle Chelydra serpentina. The best performing EIV model was then applied to the dataset again after censoring measurements in one or both variables. Here, independent likelihoods for censored and uncensored data were formulated and then easily combined following the Bayesian implementation of the model. The EIV model performed well, as revealed by posterior predictive checks around 85%, and obtained comparable parameter estimates in both censored and uncensored cases. The resulting model allows scientists and decision-makers to quantitatively link variables and to make predictions from one variable to the other while accounting for uncertainties and censored data.
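To make the idea of combining censored and uncensored likelihoods concrete, the following is a minimal sketch and not the authors' model. It assumes a straight-line relationship, equal Gaussian measurement error (`sigma`) in both variables, a flat prior on the unobserved true x (so that the residual scale becomes `sigma * sqrt(1 + slope**2)`, which corresponds to measuring misfit perpendicular to the line), and left-censoring in y only. The function names, the data layout, and the example values are all hypothetical.

```python
import math


def normal_logpdf(value, mean, sd):
    """Log density of a normal distribution at `value`."""
    z = (value - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(value, mean, sd):
    """Log of P(X <= value) for a normal distribution.

    Uses erfc; for values far below the mean this can underflow to log(0).
    """
    z = (value - mean) / sd
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def eiv_log_likelihood(observations, intercept, slope, sigma):
    """Log-likelihood of a line under equal errors in x and y (assumed).

    `observations` is a list of (x, y, y_left_censored) tuples. When
    `y_left_censored` is True, `y` holds the censoring limit and the true
    measurement is only known to lie below it.
    """
    # Assumption: equal error sd in x and y, so the vertical residual has
    # sd sigma * sqrt(1 + slope^2), i.e. the perpendicular residual has sd sigma.
    scale = sigma * math.sqrt(1.0 + slope * slope)
    total = 0.0
    for x, y, y_left_censored in observations:
        mean_y = intercept + slope * x
        if y_left_censored:
            # Censored point: probability mass below the limit.
            total += normal_logcdf(y, mean_y, scale)
        else:
            # Uncensored point: ordinary density contribution.
            total += normal_logpdf(y, mean_y, scale)
    return total


# Hypothetical data: two measured pairs and one pair with y below a limit of 0.5.
example = [(1.0, 1.1, False), (2.0, 1.9, False), (0.3, 0.5, True)]
print(eiv_log_likelihood(example, intercept=0.0, slope=1.0, sigma=0.2))
```

In a Bayesian workflow, a function like this would be added to the log-prior and passed to a sampler. Censoring in x, or in both variables, would need further terms that this sketch does not cover.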

Source journal
Chemosphere (Environmental Science: Environmental Sciences)
CiteScore: 15.80
Self-citation rate: 8.00%
Articles published: 4975
Review time: 3.4 months
About the journal: Chemosphere is an international multidisciplinary journal dedicated to publishing original communications and review articles on chemicals in the environment. Its scope covers a wide range of topics, including the identification, quantification, behavior, fate, toxicology, treatment, and remediation of chemicals in the bio-, hydro-, litho-, and atmosphere, ensuring the broad dissemination of research in this field.