Modeling the Nearest Neighbor Graphs to Estimate the Probability of the Independence of Data
Author: A. A. Kislitsyn
Journal: Mathematical Models and Computer Simulations (JCR: Q3, Mathematics)
Published: 2023-12-26
DOI: https://doi.org/10.1134/s2070048223070086
Citations: 0
Abstract
The proposed method is based on the statistics of nearest neighbor graph (NNG) structures, summarized as a benchmark: the probability distribution of graphs over the number of disconnected fragments. The deviation of the observed frequency of connectivity from the calculated one makes it possible to estimate the probability that a given sample can be regarded as a set of statistically independent variables. It is proved that the NNG statistics do not depend on the distribution of distances or on the triangle inequality, which allows such structures to be modeled numerically. The accuracy of the calculated graph statistics is estimated and compared with estimates obtained by simulating random coordinates of points in d-dimensional space. The NNG model that ignores the dimension of the space is shown to give fairly accurate estimates of the graph-structure statistics in spaces of dimension higher than five. For spaces of lower dimension, the benchmark can be obtained by directly computing the distances between points with random coordinates in a unit cube. The method is applied to the problem of analyzing the level of unsteadiness of the earthquake catalog for the Kuril–Kamchatka region. The lengths of samples of time intervals between neighboring events are analyzed. The analyzed system as a whole is shown to be interconnected with a probability of 0.91, and this dependence is fundamentally different from the lag correlation between the sample elements.
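The benchmark idea in the abstract can be illustrated with a small simulation. The sketch below is not the author's implementation: the function names, the use of i.i.d. uniform values as "distances", and the Monte Carlo setup are all assumptions made for illustration. It builds an NNG by linking every vertex to its nearest neighbor, counts the disconnected fragments, and estimates the distribution of that count in two ways: from a random symmetric matrix with no underlying space, and from points with random coordinates in a unit cube.

```python
import random
from collections import Counter


def nearest_neighbors(dist):
    """Return, for each vertex i, the index of its nearest other vertex."""
    n = len(dist)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        result.append(min(others, key=lambda j: dist[i][j]))
    return result


def count_fragments(nn):
    """Count connected components of the undirected graph with edges i -- nn[i]."""
    parent = list(range(len(nn)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(nn):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
    return sum(1 for i in range(len(nn)) if find(i) == i)


def random_distance_matrix(n, rng):
    """Assumed dimension-free model: symmetric matrix of i.i.d. uniform values.

    No embedding space is used and the triangle inequality is not enforced.
    """
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = rng.random()
    return d


def cube_distance_matrix(n, dim, rng):
    """Squared Euclidean distances between n random points in the unit cube.

    Squaring preserves the ordering of distances, so the NNG is unchanged.
    """
    pts = [[rng.random() for _ in range(dim)] for _ in range(n)]
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in pts] for p in pts]


def fragment_distribution(make_dist, n, trials, seed=0):
    """Monte Carlo estimate of P(number of fragments = k) for n vertices."""
    rng = random.Random(seed)
    counts = Counter(
        count_fragments(nearest_neighbors(make_dist(n, rng)))
        for _ in range(trials)
    )
    return {k: v / trials for k, v in sorted(counts.items())}


if __name__ == "__main__":
    n, trials = 10, 2000
    print("dimension-free:", fragment_distribution(random_distance_matrix, n, trials))
    for dim in (1, 3, 6):
        make = lambda m, rng, dim=dim: cube_distance_matrix(m, dim, rng)
        print(f"unit cube, d={dim}:", fragment_distribution(make, n, trials))
```

Comparing the printed distributions for several values of `dim` against the dimension-free one gives a rough feel for the abstract's claim about when the dimension of the space can be ignored. How an observed sample is mapped to a graph and how the deviation is converted into a probability of independence are not specified in the abstract, so they are left out here.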
About the journal:
Mathematical Models and Computer Simulations is a journal that publishes high-quality, original articles at the forefront of the development of mathematical models, numerical methods, and computer-assisted studies in science and engineering with the potential for impact across the sciences, as well as the construction of massively parallel codes for supercomputers. Its problem-oriented papers address a range of areas, including industrial mathematics, multiscale and multiphysics numerical simulation, materials science, chemistry, economics, and the social and life sciences.