Sampling with prior knowledge for high-dimensional gravitational wave data analysis

IF 6.2 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Big Data Mining and Analytics Pub Date : 2021-12-27 DOI:10.26599/BDMA.2021.9020018

He Wang;Zhoujian Cao;Yue Zhou;Zong-Kuan Guo;Zhixiang Ren

{"title":"Sampling with prior knowledge for high-dimensional gravitational wave data analysis","authors":"He Wang;Zhoujian Cao;Yue Zhou;Zong-Kuan Guo;Zhixiang Ren","doi":"10.26599/BDMA.2021.9020018","DOIUrl":null,"url":null,"abstract":"Extracting knowledge from high-dimensional data has been notoriously difficult, primarily due to the so-called \"curse of dimensionality\" and the complex joint distributions of these dimensions. This is a particularly profound issue for high-dimensional gravitational wave data analysis where one requires to conduct Bayesian inference and estimate joint posterior distributions. In this study, we incorporate prior physical knowledge by sampling from desired interim distributions to develop the training dataset. Accordingly, the more relevant regions of the high-dimensional feature space are covered by additional data points, such that the model can learn the subtle but important details. We adapt the normalizing flow method to be more expressive and trainable, such that the information can be effectively extracted and represented by the transformation between the prior and target distributions. Once trained, our model only takes approximately 1 s on one V100 GPU to generate thousands of samples for probabilistic inference purposes. The evaluation of our approach confirms the efficacy and efficiency of gravitational wave data inferences and points to a promising direction for similar research. The source code, specifications, and detailed procedures are publicly accessible on GitHub.","PeriodicalId":52355,"journal":{"name":"Big Data Mining and Analytics","volume":"5 1","pages":"53-63"},"PeriodicalIF":6.2000,"publicationDate":"2021-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8254253/9663253/09663260.pdf","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data Mining and Analytics","FirstCategoryId":"1093","ListUrlMain":"https://ieeexplore.ieee.org/document/9663260/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 3

Abstract

Extracting knowledge from high-dimensional data has been notoriously difficult, primarily due to the so-called "curse of dimensionality" and the complex joint distributions of these dimensions. This is a particularly profound issue for high-dimensional gravitational wave data analysis where one requires to conduct Bayesian inference and estimate joint posterior distributions. In this study, we incorporate prior physical knowledge by sampling from desired interim distributions to develop the training dataset. Accordingly, the more relevant regions of the high-dimensional feature space are covered by additional data points, such that the model can learn the subtle but important details. We adapt the normalizing flow method to be more expressive and trainable, such that the information can be effectively extracted and represented by the transformation between the prior and target distributions. Once trained, our model only takes approximately 1 s on one V100 GPU to generate thousands of samples for probabilistic inference purposes. The evaluation of our approach confirms the efficacy and efficiency of gravitational wave data inferences and points to a promising direction for similar research. The source code, specifications, and detailed procedures are publicly accessible on GitHub.

查看原文本刊更多论文

高维引力波数据分析的先验知识采样

从高维数据中提取知识一直是出了名的困难，主要是由于所谓的“维度诅咒”和这些维度的复杂联合分布。对于高维引力波数据分析来说，这是一个特别深刻的问题，需要进行贝叶斯推理并估计联合后验分布。在这项研究中，我们通过从期望的中期分布中采样来结合先前的物理知识，以开发训练数据集。因此，高维特征空间中更相关的区域被额外的数据点覆盖，使得模型可以学习细微但重要的细节。我们对归一化流方法进行了调整，使其更具表达性和可训练性，从而可以通过先验分布和目标分布之间的转换来有效地提取和表示信息。一旦经过训练，我们的模型在一个V100 GPU上只需要大约1秒就可以生成数千个样本用于概率推理。对我们方法的评估证实了引力波数据推断的有效性和效率，并为类似研究指明了一个有希望的方向。源代码、规范和详细过程可在GitHub上公开访问。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Big Data Mining and Analytics Computer Science-Computer Science Applications

CiteScore

20.90

自引率

2.20%

发文量

期刊介绍： Big Data Mining and Analytics, a publication by Tsinghua University Press, presents groundbreaking research in the field of big data research and its applications. This comprehensive book delves into the exploration and analysis of vast amounts of data from diverse sources to uncover hidden patterns, correlations, insights, and knowledge. Featuring the latest developments, research issues, and solutions, this book offers valuable insights into the world of big data. It provides a deep understanding of data mining techniques, data analytics, and their practical applications. Big Data Mining and Analytics has gained significant recognition and is indexed and abstracted in esteemed platforms such as ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, and more. With its wealth of information and its ability to transform the way we perceive and utilize data, this book is a must-read for researchers, professionals, and anyone interested in the field of big data analytics.