Identification of Distribution Laws Using the Correlation Coefficient Using Python

D. Losikhin, O. Oliynyk, O. Chorna, О. Gnatko
Journal: Metrologiia ta priladi. Published: 2018-12-28. DOI: 10.33955/2307-2180(6)2018.36-38 (https://doi.org/10.33955/2307-2180(6)2018.36-38)

Abstract

This article develops a new method for identifying distribution laws when evaluating the results of multiple measurements. Identifying the distribution law is an urgent metrological task: the accepted restrictions on the number of measurements, together with assumptions about the distribution law of the random error, may introduce additional uncertainty into the assessment of the measurement result.

Well-known classical approaches to identifying distribution laws face several difficulties, since the set of candidate models considered must be complete and the corresponding statistical methods must be applied correctly. Their main limitation is that they are designed for data processing systems based on the Gaussian (normal) distribution and are therefore not universal. Imperfect mathematical models for processing measurement information can lead to erroneous identification of the distribution law.

The paper proposes a method for identifying distribution laws for data outside the Gaussian distribution region. The model is based on calculating correlation coefficients for data with different distribution laws. The correlation coefficient serves as a measure of the proximity of probability density functions; it is calculated for pairs of probability densities represented by histograms, treated as vectors in a multidimensional vector space with an orthonormal basis of unit sampling intervals. From the resulting matrix of correlation coefficients, unknown distribution laws are classified using experimental data from simulated samples. A listing of the software implementation of the model in Python is given.
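
The procedure outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' listing, which is not reproduced on this page. The helper names (`standardize`, `histogram`, `pearson`, `reference_histograms`, `correlation_matrix`, `classify`), the choice of three reference laws, the 20-bin grid on [-4, 4], and the standardization step are all assumptions made for illustration. Each histogram is treated as a vector with one coordinate per sampling interval, and the Pearson correlation coefficient between two such vectors measures how close the densities are.

```python
import math
import random
import statistics

BINS, LO, HI = 20, -4.0, 4.0  # assumed histogram grid for standardized data


def standardize(sample):
    """Shift to zero mean and unit variance so that only the shape is compared."""
    mean = statistics.fmean(sample)
    std = statistics.pstdev(sample)
    return [(x - mean) / std for x in sample]


def histogram(sample, bins=BINS, lo=LO, hi=HI):
    """Relative frequencies over equal intervals; values outside [lo, hi] are dropped."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in sample:
        if lo <= x <= hi:
            counts[min(int((x - lo) / width), bins - 1)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    norm = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return cov / norm if norm else 0.0


def reference_histograms(n=50_000, seed=0):
    """Histograms of large simulated samples, one per candidate law (assumed set)."""
    rng = random.Random(seed)
    samplers = {
        "normal": lambda: rng.gauss(0.0, 1.0),
        "uniform": lambda: rng.uniform(-1.0, 1.0),
        # Difference of two exponentials follows a Laplace distribution.
        "laplace": lambda: rng.expovariate(1.0) - rng.expovariate(1.0),
    }
    return {name: histogram(standardize([draw() for _ in range(n)]))
            for name, draw in samplers.items()}


def correlation_matrix(hists):
    """Pairwise correlation coefficients between all reference histograms."""
    return {a: {b: pearson(hists[a], hists[b]) for b in hists} for a in hists}


def classify(sample, refs):
    """Return the reference law whose histogram correlates best with the sample's."""
    h = histogram(standardize(sample))
    scores = {name: pearson(h, ref) for name, ref in refs.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    refs = reference_histograms()
    rng = random.Random(1)
    unknown = [rng.uniform(0.0, 10.0) for _ in range(5000)]
    label, scores = classify(unknown, refs)
    print(label, {k: round(v, 3) for k, v in scores.items()})
```

Standardizing each sample before binning is a design choice of this sketch: it puts all histograms on a common grid, so the correlation reflects the shape of the density rather than its location or scale. Laws with similar shapes give correlation coefficients close to one another, so the off-diagonal entries of the matrix indicate how reliably two laws can be separated.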