Bayesian Approach for Comparing Parameter Estimation of Regression Model for Outlier Data

Autcha Araveeporn, K. Kumnungkit
Proceedings of the 2022 5th International Conference on Mathematics and Statistics. Published 2022-06-17. DOI: 10.1145/3545839.3545841 (https://doi.org/10.1145/3545839.3545841)

Abstract

This research compares parameter estimation methods for the simple linear regression model, which has one dependent variable and one independent variable. The parameters are estimated with the ordinary least squares (OLS) method, the Bayesian method, the Markov Chain Monte Carlo (MCMC) method, and the local weight Markov Chain Monte Carlo (LWMCMC) method. OLS is the standard method; it estimates the parameters of the linear regression model by minimizing the sum of squared errors. For the Bayesian approach, however, the choice of prior and posterior distributions may affect the approximations given by the Bayesian, MCMC, and LWMCMC methods. This paper examines OLS and the Bayesian approach when estimating parameters for outlier data, in which some data points lie far from the other observations. The independent variable is simulated from a contaminated normal distribution and the error from a normal distribution, which produces outliers in both the dependent and independent variables, for sample sizes of 20, 50, 100, and 200. The most efficient method is taken to be the one with the minimum average mean square error. In the simulations, the Bayesian method gives the minimum average mean square error at sample sizes 20 and 50. As the sample size increases, the MCMC and LWMCMC methods are the most efficient at sample sizes 100 and 200, respectively.
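
The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every numeric setting in it is an assumption, because the abstract does not state them: the contamination rate eps, the outlier scale, the true coefficients beta, the prior variance, and the number of replications. The Bayesian estimate here is the closed-form posterior mean under independent normal priors with a known error variance. The mean square error is taken as the squared distance between the estimated and true coefficients, which may differ from the criterion used in the paper. MCMC and LWMCMC are not included.

```python
import random


def contaminated_normal(rng, n, eps=0.1, scale=5.0):
    """Draw n values from (1 - eps) * N(0, 1) + eps * N(0, scale^2).

    eps and scale are illustrative assumptions, not values from the paper.
    """
    return [rng.gauss(0.0, scale if rng.random() < eps else 1.0) for _ in range(n)]


def posterior_mean(x, y, prior_var=None, noise_var=1.0):
    """Intercept and slope for y = b0 + b1 * x + error.

    prior_var=None gives ordinary least squares. Otherwise the result is the
    posterior mean under independent N(0, prior_var) priors on b0 and b1 with
    a known error variance noise_var (an assumed conjugate-normal setup).
    """
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    # The prior adds noise_var / prior_var to the diagonal of X'X.
    lam = 0.0 if prior_var is None else noise_var / prior_var
    a11, a12, a22 = n + lam, sx, sxx + lam
    det = a11 * a22 - a12 * a12
    # Solve the 2x2 system (X'X + lam * I) b = X'y by Cramer's rule.
    b0 = (a22 * sy - a12 * sxy) / det
    b1 = (a11 * sxy - a12 * sy) / det
    return b0, b1


def average_mse(n, reps=500, beta=(1.0, 2.0), prior_var=10.0, seed=0):
    """Average squared error of the estimates around the assumed true beta."""
    rng = random.Random(seed)
    totals = {"ols": 0.0, "bayes": 0.0}
    for _ in range(reps):
        x = contaminated_normal(rng, n)
        y = [beta[0] + beta[1] * v + rng.gauss(0.0, 1.0) for v in x]
        for name, pv in (("ols", None), ("bayes", prior_var)):
            b0, b1 = posterior_mean(x, y, prior_var=pv)
            totals[name] += ((b0 - beta[0]) ** 2 + (b1 - beta[1]) ** 2) / 2
    return {name: total / reps for name, total in totals.items()}


# Sample sizes taken from the abstract.
for n in (20, 50, 100, 200):
    print(n, average_mse(n))
```

The numbers this prints depend entirely on the assumed settings and should not be read as a reproduction of the paper's results. A different prior variance or contamination rate can change which estimator has the smaller average error.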