Nearest neighbour distributions: New statistical measures for cosmological clustering

arXiv: Cosmology and Nongalactic Astrophysics | Pub Date: 2020-07-27 | DOI: 10.1093/mnras/staa3604

Arka Banerjee, T. Abel


Citations: 39

Abstract

The use of summary statistics beyond the two-point correlation function to analyze non-Gaussian clustering on small scales is an active field of research in cosmology. In this paper, we explore a set of new summary statistics: the $k$-Nearest Neighbor Cumulative Distribution Functions ($k{\rm NN}$-${\rm CDF}$). For each $k$, this is the empirical cumulative distribution function of the distances from a set of volume-filling, Poisson-distributed random points to their $k$-th nearest data points, and it is sensitive to all connected $N$-point correlations in the data. The $k{\rm NN}$-${\rm CDF}$ can be used to measure counts-in-cells, void probability distributions, and higher $N$-point correlation functions, all within the same formalism, which exploits fast searches with spatial tree data structures. We demonstrate how it can be computed efficiently from various data sets, both for discrete points and, in a generalization, for continuous fields. We use data from a large suite of $N$-body simulations to explore the sensitivity of this new statistic to various cosmological parameters, compared to the two-point correlation function over the same range of scales. We demonstrate that the use of the $k{\rm NN}$-${\rm CDF}$ improves the constraints on the cosmological parameters by more than a factor of $2$ when applied to the clustering of dark matter on scales between $10h^{-1}{\rm Mpc}$ and $40h^{-1}{\rm Mpc}$. We also show that the relative improvement is even greater when the statistic is applied on the same scales to the clustering of halos in the simulations at a fixed number density, both in real space and in redshift space. Since the $k{\rm NN}$-${\rm CDF}$ are sensitive to all higher-order connected correlation functions in the data, the gains over traditional two-point analyses are expected to grow as progressively smaller scales are included in the analysis of cosmological data.
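The measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes the data points live in a periodic cubic box, uses `scipy.spatial.cKDTree` as the spatial tree, and draws the volume-filling query points uniformly at random. The function name `knn_cdf` and its parameters (`box_size`, `k_values`, `radii`, `n_query`, `seed`) are hypothetical names chosen for this example.

```python
import numpy as np
from scipy.spatial import cKDTree


def knn_cdf(data, box_size, k_values, radii, n_query=10000, seed=0):
    """Empirical kNN-CDFs measured from random, volume-filling query points.

    Illustrative sketch only (hypothetical helper, not the paper's code).
    data     : (N, d) array of positions in [0, box_size) (assumed periodic box)
    k_values : iterable of neighbour ranks k (1 = nearest)
    radii    : 1-D array of distances at which each CDF is evaluated
    Returns a dict mapping each k to an array with the same shape as radii.
    """
    rng = np.random.default_rng(seed)
    # Volume-filling query points, uniform in the box.
    query = rng.uniform(0.0, box_size, size=(n_query, data.shape[1]))

    # Periodic k-d tree built on the data points (assumption: periodic box).
    tree = cKDTree(data, boxsize=box_size)

    # One tree search returns distances to the 1st..k_max-th neighbours.
    k_max = max(k_values)
    dist, _ = tree.query(query, k=k_max)
    dist = dist.reshape(n_query, k_max)  # query() squeezes the axis when k == 1

    cdfs = {}
    for k in k_values:
        d_sorted = np.sort(dist[:, k - 1])
        # Fraction of query points whose k-th neighbour lies within each radius.
        cdfs[k] = np.searchsorted(d_sorted, radii, side="right") / n_query
    return cdfs
```

A single tree query for the largest $k$ yields all lower-$k$ distances at once, which is why the different $k{\rm NN}$-${\rm CDF}$ share one search. The nearest-neighbour case also gives the void probability directly: the probability that a randomly placed sphere of radius $r$ contains no data point is $1 - {\rm CDF}_{1{\rm NN}}(r)$.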


