A novel approach for variable star classification based on imbalanced learning
Authors: Jingyi Zhang, Yanxia Zhang, Zihan Kang, Changhua Li, Yihan Tao, Yongheng Zhao, Xue-bing Wu
Journal: Publications of the Astronomical Society of Australia (JCR Q1, Astronomy & Astrophysics; impact factor 4.5)
Published: 2023-08-09
DOI: 10.1017/pasa.2023.35 (https://doi.org/10.1017/pasa.2023.35)
Citations: 0
Abstract
The advent of time-domain sky surveys has generated a vast amount of light-variation data, enabling astronomers to investigate variable stars with large-scale samples. This brings both new opportunities and new challenges for time-domain research. In this paper, we focus on the classification of variable stars from the Catalina Surveys Data Release 2 and propose an imbalanced learning classifier based on the Self-paced Ensemble (SPE) method. Compared with the work of Hosenie et al. (2020), our approach significantly improves the classification recall of Blazhko RR Lyrae stars from 12% to 85%, of mixed-mode RR Lyrae variables from 29% to 64%, of detached binaries from 68% to 97%, and of LPV from 87% to 99%. SPE performs well on most variable classes, the exceptions being RRab, RRc, and contact and semi-detached binaries. Moreover, the results suggest that SPE tends to target the minority classes of objects, while Random Forest is more effective at finding the majority classes. To balance the overall classification accuracy, we construct a Voting Classifier that combines the strengths of SPE and Random Forest. The results show that the Voting Classifier achieves a balanced performance across all classes with minimal loss of accuracy. In summary, the SPE algorithm and the Voting Classifier outperform traditional machine learning methods and are well suited to classifying periodic variable stars. This paper contributes to current research on imbalanced learning in astronomy, and the approach can also be extended to the time-domain data of other, larger sky survey projects (LSST, etc.).
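The two ideas the abstract relies on, per-class recall and a voting scheme that combines two classifiers, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say whether the Voting Classifier uses hard or soft voting or how the models are weighted, so soft voting with adjustable weights is assumed here. The function names, class labels, and probability values are all hypothetical toy inputs.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted members / all true members."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def soft_vote(proba_a, proba_b, weights=(0.5, 0.5)):
    """Assumed soft voting: weighted mean of two models' class probabilities."""
    w_a, w_b = weights
    labels = set(proba_a) | set(proba_b)
    combined = {
        c: w_a * proba_a.get(c, 0.0) + w_b * proba_b.get(c, 0.0) for c in labels
    }
    # Sort labels first so that ties are broken deterministically.
    return max(sorted(combined), key=combined.get)


# Made-up labels: a majority class (RRab) and a minority class (Blazhko).
y_true = ["RRab", "RRab", "RRab", "RRab", "Blazhko", "Blazhko"]
y_pred = ["RRab", "RRab", "RRab", "RRab", "RRab", "Blazhko"]

recall = per_class_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical probabilities for one star from two models: one that leans
# toward the minority class and one that leans toward the majority class.
minority_leaning = {"RRab": 0.4, "Blazhko": 0.6}
majority_leaning = {"RRab": 0.7, "Blazhko": 0.3}

equal_vote = soft_vote(minority_leaning, majority_leaning)
weighted_vote = soft_vote(minority_leaning, majority_leaning, weights=(0.7, 0.3))
```

In the toy labels, overall accuracy is 5/6 while recall on the minority class is only 0.5, which is why per-class recall rather than accuracy is the informative metric on imbalanced data. The two `soft_vote` calls show how the weighting shifts the decision between the majority-leaning and minority-leaning model.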
About the journal:
Publications of the Astronomical Society of Australia (PASA) publishes new and significant research in astronomy and astrophysics. PASA covers a wide range of topics within astronomy, including multi-wavelength observations, theoretical modelling, computational astronomy and visualisation. PASA also maintains its heritage of publishing results on southern hemisphere astronomy and on astronomy with Australian facilities.
PASA publishes research papers, review papers and special series on topical issues, making use of expert international reviewers and an experienced Editorial Board. As an electronic-only journal, PASA publishes paper by paper, ensuring a rapid publication rate. There are no page charges. PASA's Editorial Board approves a certain number of papers per year to be published Open Access without a publication fee.