LyαNNA: A deep learning field-level inference machine for the Lyman-α forest

Astronomy & Astrophysics, Pub Date: 2024-07-25, DOI: 10.1051/0004-6361/202348485

Parth Nayak, Michael Walther, Daniel Gruen, Sreyas Adiraju



Abstract

The inference of astrophysical and cosmological properties from the Lyman-α forest conventionally relies on summary statistics of the transmission field that carry useful but limited information. We present a deep learning framework for inference from the Lyman-α forest at the field level. This framework consists of a 1D residual convolutional neural network (ResNet) that extracts spectral features and performs regression on the thermal parameters of the intergalactic medium that characterize the power-law temperature-density relation. We trained this supervised machinery using a large set of mock absorption spectra from Nyx hydrodynamic simulations at $z=2.2$ with a range of thermal parameter combinations (labels). We employed Bayesian optimization to find an optimal set of hyperparameters for our network, and then employed a committee of 20 neural networks for increased statistical robustness of the network inference. In addition to the parameter point predictions, our machine also provides a self-consistent estimate of their covariance matrix, with which we constructed a pipeline for inferring the posterior distribution of the parameters. We compared the results of our framework with the traditional summary-based approach, namely the power spectrum and the probability density function (PDF) of transmission, using the area of the 68% credibility regions as our figure of merit (FoM). In our study of the information content of perfect (noise- and systematics-free) Lyα forest spectral datasets, we find a significant tightening of the posterior constraints: the FoM improves by a factor of 10.92 over the power spectrum alone and by a factor of 3.30 over the power spectrum combined with the PDF. This tightening is the consequence of recovering the relevant parts of information that are not carried by the classical summary statistics.
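The figure of merit described above can be illustrated with a short sketch. This is not the authors' code: it assumes a two-parameter posterior that is well approximated by a Gaussian, so that the 68% credibility region is an ellipse whose area follows from the covariance matrix. The function names `credible_area_2d`, `tightening_factor`, and `committee_mean` are hypothetical, and the plain average over committee members is only one possible way to combine the networks' point predictions.

```python
import numpy as np


def credible_area_2d(cov, level=0.68):
    """Area of the `level` credible ellipse of a 2D Gaussian posterior.

    Assumption: the posterior is Gaussian with covariance `cov` (2x2).
    """
    cov = np.asarray(cov, dtype=float)
    if cov.shape != (2, 2):
        raise ValueError("cov must be a 2x2 matrix")
    # For 2 degrees of freedom the chi-square quantile has a closed form:
    # P(chi2 <= x) = 1 - exp(-x / 2)  =>  x = -2 ln(1 - level).
    chi2_quantile = -2.0 * np.log(1.0 - level)
    # Ellipse area = pi * chi2_quantile * sqrt(det(cov)).
    return np.pi * chi2_quantile * np.sqrt(np.linalg.det(cov))


def tightening_factor(cov_reference, cov_new, level=0.68):
    """Ratio of credible-region areas (reference over new).

    A value above 1 means the new method gives a tighter constraint.
    """
    return credible_area_2d(cov_reference, level) / credible_area_2d(cov_new, level)


def committee_mean(predictions):
    """Average point predictions over committee members (first axis).

    Assumption: a simple unweighted mean; the paper's combination rule
    may differ.
    """
    return np.mean(np.asarray(predictions, dtype=float), axis=0)


if __name__ == "__main__":
    # Hypothetical covariances, chosen only to exercise the functions.
    cov_summary = np.array([[4.0, 0.0], [0.0, 4.0]])
    cov_field = np.array([[1.0, 0.0], [0.0, 1.0]])
    print(tightening_factor(cov_summary, cov_field))
    print(committee_mean([[1.0, 2.0], [3.0, 4.0]]))
```

Under the Gaussian assumption the ratio of areas reduces to the ratio of the square roots of the covariance determinants, so the tightening factor does not depend on the chosen credibility level.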


