A novel approach for variable star classification based on imbalanced learning

Impact factor 4.5 · CAS Zone 3 (Physics and Astrophysics) · JCR Q1 (Astronomy & Astrophysics)
Jingyi Zhang, Yanxia Zhang, Zihan Kang, Changhua Li, Yihan Tao, Yongheng Zhao, Xue-bing Wu
Journal: Publications of the Astronomical Society of Australia
Publication date: 2023-08-09
Publication type: Journal Article
DOI: 10.1017/pasa.2023.35 (https://doi.org/10.1017/pasa.2023.35)

Abstract

The advent of time-domain sky surveys has generated a vast amount of light variation data, enabling astronomers to investigate variable stars with large-scale samples. This also brings new opportunities and challenges for time-domain research. In this paper, we focus on the classification of variable stars from the Catalina Surveys Data Release 2 and propose an imbalanced learning classifier based on the Self-paced Ensemble (SPE) method. Compared with the work of Hosenie et al. (2020), our approach significantly improves the classification recall of Blazhko RR Lyrae stars from 12% to 85%, mixed-mode RR Lyrae variables from 29% to 64%, detached binaries from 68% to 97%, and long-period variables (LPV) from 87% to 99%. SPE performs well on most variable classes, the exceptions being RRab, RRc, and contact and semi-detached binaries. Moreover, the results suggest that SPE tends to target the minority classes of objects, while Random Forest is more effective at finding the majority classes. To balance the overall classification accuracy, we construct a Voting Classifier that combines the strengths of SPE and Random Forest. The results show that the Voting Classifier achieves a balanced performance across all classes with minimal loss of accuracy. In summary, the SPE algorithm and the Voting Classifier outperform traditional machine learning methods and are well suited to classifying periodic variable stars. This paper contributes to current research on imbalanced learning in astronomy, and the approach can be extended to the time-domain data of other, larger sky survey projects (LSST, etc.).
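To make the two ideas in the abstract concrete (per-class recall as the metric for rare classes, and voting between two classifiers), here is a minimal sketch in plain Python. It is not the authors' implementation. The labels, the toy predictions, the probability values, and the helper names `per_class_recall` and `soft_vote` are all made up for illustration. The abstract does not say whether the Voting Classifier uses hard or soft voting, so the weighted probability averaging below is an assumption.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly recovered members / all true members."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def soft_vote(proba_a, proba_b, weight_a=0.5):
    """Blend two {label: probability} dicts and return the top label.

    weight_a is the weight given to the first classifier; the second
    gets 1 - weight_a. This is an assumed voting scheme, not one taken
    from the paper.
    """
    labels = sorted(set(proba_a) | set(proba_b))
    blended = {
        label: weight_a * proba_a.get(label, 0.0)
        + (1.0 - weight_a) * proba_b.get(label, 0.0)
        for label in labels
    }
    return max(labels, key=blended.get)


# Hypothetical imbalanced sample: 8 majority-class and 2 minority-class stars.
y_true = ["RRab"] * 8 + ["Blazhko"] * 2

# A classifier that always predicts the majority class is right 80% of the
# time overall, yet recovers none of the minority class.
y_majority_only = ["RRab"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_majority_only)) / len(y_true)
recall = per_class_recall(y_true, y_majority_only)

# Hypothetical class probabilities for one star from two classifiers:
# one leaning towards the minority class, one towards the majority class.
proba_minority_leaning = {"RRab": 0.4, "Blazhko": 0.6}
proba_majority_leaning = {"RRab": 0.7, "Blazhko": 0.3}

equal_weight_label = soft_vote(proba_minority_leaning, proba_majority_leaning)
minority_weighted_label = soft_vote(
    proba_minority_leaning, proba_majority_leaning, weight_a=0.8
)
```

In this toy setup, overall accuracy hides the complete failure on the minority class, which is why the abstract reports recall per class. The vote weight shifts the blended decision between the two classifiers' preferences, which is one plausible way to trade minority-class recall against majority-class accuracy.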
Source journal
Publications of the Astronomical Society of Australia
Category: Earth Sciences and Astronomy - Astronomy and Astrophysics
CiteScore: 5.90
Self-citation rate: 9.50%
Articles published: 41
Review time: >12 weeks
About the journal: Publications of the Astronomical Society of Australia (PASA) publishes new and significant research in astronomy and astrophysics. PASA covers a wide range of topics within astronomy, including multi-wavelength observations, theoretical modelling, computational astronomy and visualisation. PASA also maintains its heritage of publishing results on southern hemisphere astronomy and on astronomy with Australian facilities. PASA publishes research papers, review papers and special series on topical issues, making use of expert international reviewers and an experienced Editorial Board. As an electronic-only journal, PASA publishes paper by paper, ensuring a rapid publication rate. There are no page charges. PASA's Editorial Board approves a certain number of papers per year to be published Open Access without a publication fee.