Comparative analysis of machine learning techniques for feature selection and classification of Fast Radio Bursts

IF 10.5 | Zone 4: Physics and Astrophysics | JCR Q1: ASTRONOMY & ASTROPHYSICS
Ailton J.B. Júnior, Jéferson A.S. Fortunato, Leonardo J. Silvestre, Thonimar V. Alencar, Wiliam S. Hipólito-Ricaldi
Journal of High Energy Astrophysics, volume 49, Article 100449. Published 2025-08-18.
DOI: 10.1016/j.jheap.2025.100449
URL: https://www.sciencedirect.com/science/article/pii/S2214404825001302

Abstract

Fast Radio Bursts (FRBs) are millisecond-duration radio transients of extragalactic origin, exhibiting a wide range of physical and observational properties. Distinguishing between repeating and non-repeating FRBs remains a key challenge in understanding their nature. In this work, we apply unsupervised machine learning techniques to classify FRBs based on both primary observables from the CHIME catalog and physically motivated derived features. We evaluate three hybrid pipelines combining dimensionality reduction with clustering: Principal Component Analysis (PCA) + k-means, t-distributed Stochastic Neighbor Embedding (t-SNE) + Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), and t-SNE + Spectral Clustering. To identify optimal hyperparameters, we implement a comprehensive grid search using a custom scoring function that prioritizes recall while penalizing excessive cluster fragmentation and noise. Feature relevance is assessed using principal component loadings, mutual information with the known repeater label, and permutation-based F2 score sensitivity. Our results demonstrate that the derived features, including redshift, luminosity, and spectral properties, such as the spectral index and the spectral running, significantly enhance the classification performance. Finally, we identify a set of FRBs currently labeled as non-repeaters that consistently cluster with known repeaters across all methods, highlighting promising candidates for future follow-up observations and reinforcing the utility of unsupervised approaches in FRB population studies.
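The abstract mentions a custom scoring function that prioritizes recall while penalizing cluster fragmentation and noise, but it does not give its form. The sketch below is one possible reading, not the authors' implementation. The F-beta formula with beta = 2 is standard. Everything else is an assumption made for illustration: the rule in `repeater_like` for turning cluster labels into repeater predictions, the penalty terms, the weights `w_frag` and `w_noise`, the `max_clusters` threshold, and the use of -1 as the noise label (the HDBSCAN convention).

```python
from collections import Counter

NOISE = -1  # assumed noise label, following the HDBSCAN convention


def fbeta(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels (1 = repeater); beta > 1 favors recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def repeater_like(cluster_labels, y_true):
    """Hypothetical rule: a cluster is repeater-like if its fraction of known
    repeaters exceeds the overall repeater fraction. Noise is never flagged."""
    overall = sum(y_true) / len(y_true)
    size = Counter(cluster_labels)
    hits = Counter(c for c, t in zip(cluster_labels, y_true) if t == 1)
    enriched = {c for c in size if c != NOISE and hits[c] / size[c] > overall}
    return [1 if c in enriched else 0 for c in cluster_labels]


def clustering_score(cluster_labels, y_true, max_clusters=3,
                     w_frag=0.05, w_noise=0.5):
    """Hypothetical score: F2 minus penalties for extra clusters and noise."""
    y_pred = repeater_like(cluster_labels, y_true)
    n_clusters = len(set(cluster_labels) - {NOISE})
    noise_fraction = sum(1 for c in cluster_labels if c == NOISE) / len(cluster_labels)
    fragmentation = max(0, n_clusters - max_clusters)  # clusters beyond the cap
    return fbeta(y_true, y_pred) - w_frag * fragmentation - w_noise * noise_fraction
```

In a grid search, `clustering_score` would be evaluated once per hyperparameter combination and the highest-scoring configuration kept. Non-repeaters that land in a repeater-like cluster (predicted 1, labeled 0) correspond to the follow-up candidates the abstract describes.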
Source journal
Journal of High Energy Astrophysics (Earth and Planetary Sciences: Space and Planetary Science)
CiteScore: 9.70
Self-citation rate: 5.30%
Articles published: 38
Review time: 65 days
About the journal: The journal welcomes manuscripts on theoretical models, simulations, and observations of highly energetic astrophysical objects both in our Galaxy and beyond. Among those, black holes at all scales, neutron stars, pulsars and their nebulae, binaries, novae and supernovae, their remnants, active galaxies, and clusters are just a few examples. The journal will consider research across the whole electromagnetic spectrum, as well as research using various messengers, such as gravitational waves or neutrinos. Effects of high-energy phenomena on cosmology and star formation, results from dedicated surveys expanding the knowledge of extreme environments, and astrophysical implications of dark matter are also welcome topics.