A gradient boosting and broadband approach to finding Lyman-α emitting galaxies beyond narrowband surveys

Impact factor 5.8; Zone 2 (Physics and Astrophysics); JCR Q1 (Astronomy & Astrophysics)
A. Vale, A. Paulino-Afonso, A. Humphrey, P. A. C. Cunha, B. Ribeiro, B. Cerqueira, R. Carvajal, J. Fonseca
Journal: Astronomy & Astrophysics
Publication date: 2025-09-16
Publication type: Journal Article
DOI: 10.1051/0004-6361/202555170 (https://doi.org/10.1051/0004-6361/202555170)
Citation count: 0

Abstract

Context. The identification of Lyman-α emitting galaxies (LAEs) has traditionally relied either on dedicated surveys using custom narrowband filters, which constrain observations to specific narrow redshift intervals, or on blind spectroscopy, which, although unbiased, typically requires extensive telescope time. This makes it challenging to assemble large, statistically robust galaxy samples. With the advent of wide-area astronomical surveys producing datasets significantly larger than those of traditional surveys, new techniques are needed.

Aims. We test whether gradient-boosting algorithms, trained on broadband photometric data from traditional LAE surveys, can efficiently and accurately distinguish LAE candidates from typical star-forming galaxies at similar redshifts and brightness levels.

Methods. Using galaxy samples at z ∈ [2, 6] derived from the COSMOS2020 and SC4K catalogs, we trained gradient-boosting machine-learning algorithms (LGBM, XGBoost, and CatBoost) on optical and near-infrared broadband photometry. To ensure balanced performance, the models were trained on carefully selected datasets with similar redshift and i-band magnitude distributions. Additionally, the models were tested for robustness by perturbing the photometric data using the associated observational uncertainties.

Results. Our classification models achieved F1-scores of ∼87% and identified about 7000 objects with unanimous agreement across all models. This more than doubles the number of LAEs identified in the COSMOS field compared with the SC4K dataset. We spectroscopically confirmed 60 of these LAE candidates using the publicly available catalogs in the COSMOS field.

Conclusions. These results highlight the potential of machine learning for efficiently identifying LAE candidates. This lays the foundations for applications to larger photometric surveys, such as Euclid and LSST. By complementing traditional approaches and providing robust preselection capabilities, our models facilitate the analysis of these objects. This is crucial for increasing our knowledge of the overall LAE population.
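The abstract mentions three ideas that a short sketch may help make concrete: the F1-score used to rate the classifiers, the robustness test that perturbs photometry within its uncertainties, and the requirement that all models agree. The Python below is only an illustration under stated assumptions, not the authors' pipeline. The three `toy_models` are hypothetical colour-cut rules standing in for the trained LGBM, XGBoost, and CatBoost classifiers, and the function names, thresholds, and magnitudes are invented for the example.

```python
import random


def f1_score(y_true, y_pred):
    """F1 = 2PR / (P + R) for the positive class (label 1 = LAE)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def perturb(mags, errs, rng):
    """One noisy realisation: each magnitude drawn from N(mag, err)."""
    return [rng.gauss(m, e) for m, e in zip(mags, errs)]


def unanimous(models, features):
    """True only if every model labels the object as an LAE candidate."""
    return all(model(features) == 1 for model in models)


def stable_fraction(models, mags, errs, n_draws=200, seed=0):
    """Fraction of perturbed realisations that stay unanimously selected."""
    rng = random.Random(seed)
    hits = sum(
        unanimous(models, perturb(mags, errs, rng)) for _ in range(n_draws)
    )
    return hits / n_draws


# Hypothetical stand-ins for trained classifiers: simple cuts on
# differences between three broadband magnitudes (not physically motivated).
toy_models = [
    lambda f: int(f[0] - f[1] > 0.5),
    lambda f: int(f[1] - f[2] < 0.3),
    lambda f: int(f[0] - f[2] > 0.6),
]

if __name__ == "__main__":
    mags = [25.0, 24.0, 23.9]   # made-up magnitudes in three bands
    errs = [0.10, 0.05, 0.05]   # made-up 1-sigma uncertainties
    print("unanimous:", unanimous(toy_models, mags))
    print("stable fraction:", stable_fraction(toy_models, mags, errs))
```

In this sketch an object counts as a candidate only when every rule votes for it, and `stable_fraction` gives a rough sense of how often that verdict survives photometric noise. How the paper itself combines perturbed predictions is not stated in the abstract, so this aggregation is an assumption.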
Source journal: Astronomy & Astrophysics (category: Earth Sciences and Astronomy, Astronomy and Astrophysics)
CiteScore: 10.20
Self-citation rate: 27.70%
Articles published: 2105
Review time: 1-2 weeks
About the journal: Astronomy & Astrophysics is an international journal that publishes papers on all aspects of astronomy and astrophysics (theoretical, observational, and instrumental), independently of the techniques used to obtain the results.