Problems of nonparametric goodness-of-fit test application in tasks of measurement results processing

Analysis and data processing systems Pub Date : 2021-06-18 DOI:10.17212/2782-2001-2021-2-47-66

B. Lemeshko, S. Lemeshko


Citations: 0

Abstract


Problems of nonparametric goodness-of-fit test application in tasks of measurement results processing

It is argued that, in most cases, two reasons underlie the incorrect application of nonparametric goodness-of-fit tests in various applications. The first reason is that classical results for testing simple hypotheses are used when testing composite hypotheses, where the parameters of the law are estimated from the analyzed sample. When testing composite hypotheses, the distributions of goodness-of-fit statistics are influenced by the form of the observed law F(x, θ) corresponding to the hypothesis being tested, by the type and number of estimated parameters, by the estimation method, and in some cases by the value of the shape parameter. The paper shows the influence of all these factors on the distributions of the test statistics. It is emphasized that, when testing composite hypotheses, neglecting the fact that the test has lost its distribution-free property leads to an increase in the probability of errors of the 2nd kind. It is shown that the distribution of the test statistic needed to reach a conclusion about a composite hypothesis can be found by simulation, in an interactive mode, directly in the process of testing. The second reason is the presence of round-off errors, which can significantly change the distributions of test statistics. The paper shows that asymptotic results for testing simple and composite hypotheses can be used when the round-off error Δ is much smaller than the standard deviation σ of the distribution law of measurement errors and the sample size n does not exceed some maximum value. For sample sizes larger than this maximum, the real distributions of the test statistics deviate from the asymptotic ones towards larger values of the statistics. In such situations, using asymptotic distributions to reach a conclusion about the test results leads to an increase in the probability of errors of the 1st kind (rejection of a valid hypothesis being tested). It is shown that when Δ and σ are commensurable, the distributions of the test statistics deviate from the asymptotic distributions even for small n, and the situation only gets worse as n grows. Changes in the distributions of the statistics under the influence of rounding are demonstrated for both simple and composite hypotheses. It is shown that the only way to ensure correct conclusions from the applied tests in such non-standard conditions is to use the real distributions of the statistics. This task can be solved interactively (in the process of testing) and relies on computer research technologies and the apparatus of mathematical statistics.
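The two effects described in the abstract can be illustrated with a small Monte Carlo experiment. The sketch below is not the authors' software. It assumes a standard normal law, the Kolmogorov statistic in the form sqrt(n)·D_n, the classical 5% critical value 1.358 of the limiting Kolmogorov distribution, maximum likelihood estimates of the mean and standard deviation for the composite case, and rounding with Δ = σ for the round-off case. The names `kolmogorov_stat` and `rejection_rate`, the sample size n = 100 and the 1000 replications are all illustrative choices.

```python
import math
import random

# Upper 5% point of the limiting Kolmogorov distribution (simple hypothesis).
KOLMOGOROV_CRIT_05 = 1.358


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the normal law with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def kolmogorov_stat(sample, mu, sigma):
    """Return sqrt(n) * D_n for the normal law N(mu, sigma^2)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # Largest gap between the empirical and the hypothesized CDF.
        d = max(d, (i + 1) / n - f, f - i / n)
    return math.sqrt(n) * d


def rejection_rate(n, reps, estimate=False, delta=0.0, seed=0):
    """Share of true-H0 samples rejected with the classical critical value.

    estimate=True  -> composite hypothesis (parameters fitted by ML).
    delta > 0      -> observations rounded to a grid of step delta.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if delta > 0.0:
            xs = [round(x / delta) * delta for x in xs]
        if estimate:
            mu = sum(xs) / n
            sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
        else:
            mu, sigma = 0.0, 1.0
        if kolmogorov_stat(xs, mu, sigma) > KOLMOGOROV_CRIT_05:
            rejected += 1
    return rejected / reps


simple_rate = rejection_rate(n=100, reps=1000)
composite_rate = rejection_rate(n=100, reps=1000, estimate=True)
rounded_rate = rejection_rate(n=100, reps=1000, delta=1.0)

print("simple hypothesis:        ", simple_rate)
print("composite, classical crit:", composite_rate)
print("simple, rounding D = s:   ", rounded_rate)
```

Under these assumptions, the first rate should stay close to the nominal 0.05. The second should fall far below it, because the statistic is stochastically smaller once the parameters are fitted, so the classical critical value makes the test overly conservative and inflates errors of the 2nd kind. The third should approach 1, because coarse rounding pushes the statistic towards larger values and a valid hypothesis is rejected almost every time. Replacing the fixed critical value with a quantile of statistics simulated under the same conditions (same law, same estimation method, same Δ and n) is the kind of "real distribution" approach the abstract recommends.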


Source journal

Analysis and data processing systems

Self-citation rate

0.00%

Number of publications