Bayesian Approach for Comparing Parameter Estimation of Regression Model for Outlier Data

Proceedings of the 2022 5th International Conference on Mathematics and Statistics. Pub Date: 2022-06-17. DOI: 10.1145/3545839.3545841

Autcha Araveeporn, K. Kumnungkit



Abstract

This research compares parameter estimation methods for the simple linear regression model, which consists of one dependent variable and one independent variable. Four methods are considered: the ordinary least squares (OLS) method, the Bayesian method, the Markov Chain Monte Carlo (MCMC) method, and the local weight Markov Chain Monte Carlo (LWMCMC) method. OLS is the standard method; it estimates the parameters of the linear regression model by minimizing the sum of squared errors. For the Bayesian approach, however, the use of prior and posterior distributions may affect the approximations obtained by the Bayesian, MCMC, and LWMCMC methods. This paper compares the OLS method and the Bayesian approach for estimating the parameters on outlier data, in which some data points lie far from the other observations. The independent variable is simulated from a contaminated normal distribution and the error from a normal distribution, which produces outliers in both the dependent and independent variables, for sample sizes of 20, 50, 100, and 200. The most efficient method is taken to be the one with the minimum average mean square error. In the simulations, the Bayesian method gives the minimum average mean square error at sample sizes 20 and 50. As the sample size increases, however, the MCMC and LWMCMC methods are the most efficient at sample sizes 100 and 200, respectively.
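The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The contamination proportion (`eps`), the outlier scale, the true coefficients, the prior variance, and the number of replications are all assumptions, since the abstract gives none of them. The Bayesian estimate here is the posterior mean under a conjugate normal prior with known noise variance, chosen only because it has a closed form; the MCMC and LWMCMC methods are not reproduced. The abstract also does not define the mean square error precisely, so the sketch assumes it is the squared error of the estimated coefficients against the true ones.

```python
import numpy as np

def contaminated_normal(rng, n, eps=0.1, scale=5.0):
    """Draw n values: N(0, 1) with probability 1 - eps, N(0, scale^2) otherwise.
    eps and scale are assumed values, not taken from the paper."""
    is_outlier = rng.random(n) < eps
    sd = np.where(is_outlier, scale, 1.0)
    return rng.normal(0.0, sd)

def ols_fit(x, y):
    """Ordinary least squares estimate of (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def bayes_fit(x, y, prior_var=10.0, noise_var=1.0):
    """Posterior mean of (intercept, slope) under a N(0, prior_var * I) prior
    with known noise variance (a conjugate setup assumed for illustration)."""
    X = np.column_stack([np.ones_like(x), x])
    precision = X.T @ X / noise_var + np.eye(2) / prior_var
    return np.linalg.solve(precision, X.T @ y / noise_var)

def average_mse(n, fit, reps=500, beta_true=(1.0, 2.0), seed=0):
    """Average over reps simulated data sets of the mean squared error
    between the estimated and the (assumed) true coefficients."""
    rng = np.random.default_rng(seed)
    truth = np.asarray(beta_true)
    total = 0.0
    for _ in range(reps):
        x = contaminated_normal(rng, n)
        y = truth[0] + truth[1] * x + rng.normal(0.0, 1.0, n)
        total += np.mean((fit(x, y) - truth) ** 2)
    return total / reps

if __name__ == "__main__":
    # Sample sizes are the ones named in the abstract.
    for n in (20, 50, 100, 200):
        print(n, average_mse(n, ols_fit), average_mse(n, bayes_fit))
```

Because both calls to `average_mse` use the same seed, the two estimators are evaluated on identical simulated data sets, so any difference in the printed values comes from the estimator rather than from sampling noise.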


