Software Implementation of the Epps-Pulley Criterion in Matlab Modeling Environment

A. A. Tipikin, A. A. Prusakov, N. A. Timoshenko
Open Education, published 2024-05-15. DOI: 10.21686/1818-4243-2024-2-59-72 (https://doi.org/10.21686/1818-4243-2024-2-59-72)

Abstract

Purpose. Modeling systems and programming platforms offer ample opportunities for using statistical tools in research. Because the normal distribution is one of the most common distribution laws, tests of whether a sample is normally distributed are in high demand, and the Epps-Pulley test is regarded as one of the most powerful tests for detecting departures from normality. Several implementations of this test exist in R and Python, but it is not implemented in Matlab, one of the most popular modeling environments. The purpose of this study is therefore to develop a software implementation of the Epps-Pulley test in the Matlab environment and to verify the correctness of its calculations.

Materials and Methods. We implemented the calculation of the Epps-Pulley statistic in two ways: a classical method based on loops and a matrix-vector method based on linear algebra operations. The classical method computes the intermediate values needed for the test statistic in two independent loops, the second of which is a double loop with one loop nested inside the other. The matrix-vector method requires less code because it performs the calculations with linear algebra operations on matrices and vectors. We obtained critical values of the statistic for sample sizes from 8 to 1000 observations by two-dimensional linear interpolation of tabulated values. For samples of more than 1000 elements, we used an approximation by a beta function of the third kind.

Results. An assessment of computational efficiency showed that the loop-based approach outperforms the matrix-vector approach by a factor of about three in computation time, presumably because component-by-component operations on the matrices also process the insignificant elements of triangular matrices. We tested the correctness of the implementation on several examples, which confirmed that both the calculated values of the test statistic and the critical values agree with known data. We evaluated the test statistically using empirical type I error rates, and the observed error rates corresponded to the specified significance levels. We compared the empirical power of the Epps-Pulley test with that of the Anderson-Darling and Shapiro-Wilk tests and tabulated the results. We published the software implementation of the Epps-Pulley test on MATLAB Central for free use.

Conclusion. We developed a software implementation of the Epps-Pulley test, a research tool that was previously unavailable in the Matlab modeling environment. We used the measured computation time to make a reasoned choice of the algorithm for calculating the test statistic. We confirmed the correctness of the calculation algorithms through a set of spot checks and statistical estimates that agree with well-known theoretical results.
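The two calculation schemes described in Materials and Methods can be sketched as follows. This is a minimal illustration and not the authors' Matlab code: it is written in plain Python, it assumes the form of the Epps-Pulley statistic commonly given in ISO 5479, and the function names are hypothetical. The first function uses a single loop and a nested double loop over pairs j < k. The second builds the full symmetric matrix of pairwise terms, which is roughly what an elementwise matrix formulation does, and then discards the diagonal and the duplicated triangle.

```python
import math


def _moments(x):
    """Return the sample mean and the biased second central moment m2."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    return mean, m2


def _single_sum(x, mean, m2):
    """Sum over observations of exp(-(x_j - mean)^2 / (4 * m2))."""
    return sum(math.exp(-((v - mean) ** 2) / (4.0 * m2)) for v in x)


def epps_pulley_loops(x):
    """Loop-based statistic (assumed ISO 5479 form); needs a non-constant sample."""
    n = len(x)
    mean, m2 = _moments(x)
    b = _single_sum(x, mean, m2)      # first, single loop
    a = 0.0
    for k in range(1, n):             # second, double loop over pairs j < k
        for j in range(k):
            a += math.exp(-((x[j] - x[k]) ** 2) / (2.0 * m2))
    return 1.0 + n / math.sqrt(3.0) + 2.0 / n * a - math.sqrt(2.0) * b


def epps_pulley_full_matrix(x):
    """Same statistic computed from the full n-by-n matrix of pairwise terms."""
    n = len(x)
    mean, m2 = _moments(x)
    b = _single_sum(x, mean, m2)
    full = [[math.exp(-((xj - xk) ** 2) / (2.0 * m2)) for xk in x] for xj in x]
    total = sum(sum(row) for row in full)
    # The diagonal contributes n ones and each pair j < k is counted twice.
    a = (total - n) / 2.0
    return 1.0 + n / math.sqrt(3.0) + 2.0 / n * a - math.sqrt(2.0) * b


if __name__ == "__main__":
    sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]  # made-up data
    print(epps_pulley_loops(sample), epps_pulley_full_matrix(sample))
```

The full-matrix variant evaluates n squared exponentials where the double loop evaluates n(n - 1)/2, which is consistent with the explanation the abstract offers for the timing difference. Timings of this sketch would not transfer to Matlab.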