Asteroid family classification with machine learning: Investigative analysis of a novel two-step approach for categorizing known small asteroid families

IF 2.2, Zone 3 (Physics and Astrophysics), Q2 ASTRONOMY & ASTROPHYSICS

Experimental Astronomy, Pub Date: 2025-01-31, DOI: 10.1007/s10686-025-09982-y

Fatin Abrar Shams, Abdullah Al Mahmud Nafiz, Md. Salman Mohosheu, Maheen Mashrur Hoque, Samiur Rashid Abir, Rashed Hasan Ratul, Md. Mushfiqur Rahman Mushfique, Aftab Ibn Nazim, Rubaiat Rehman Khan, Md Mahmudunnobe, Mohsinul Kabir

Experimental Astronomy, volume 59, issue 1. https://link.springer.com/article/10.1007/s10686-025-09982-y

Citations: 0

Abstract

The term “asteroid family” refers to a collection of asteroids that share similar proper orbital elements, such as semi-major axis, eccentricity, and orbital inclination. Detecting small asteroid families has long been a challenge because of their extremely small sample sizes. Standalone machine learning classifiers tend to be biased towards classes with larger sample sizes, so asteroid families with limited data are classified poorly. This paper proposes a two-step supervised model for classifying asteroid families, aimed especially at the tiny, small, and lower group of medium asteroid families. To address the imbalanced dataset, the model first performs a binary classification of small versus large families with an XGB (Extreme Gradient Boosting) classifier; in the second stage, a Random Forest classifier, used alongside the previously identified binary features, classifies the asteroid families. The proposed model achieved higher F1 scores for tiny and small asteroid families than the other algorithms tested in this work, and a perfect F1 score for 90 of the 112 families tested. For the lower group of medium-sized asteroid families, it performed slightly worse than previously used machine learning algorithms. In addition, four techniques for handling dataset imbalance were applied in this work and compared with the proposed algorithm.
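
The two-step idea in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the family sizes and the `SMALL_MAX` cutoff are hypothetical, scikit-learn's `GradientBoostingClassifier` stands in for the XGB classifier, and the choice to train the second stage on the true small/large flag while using the predicted flag at test time is a guess, since the abstract does not specify it.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for three proper elements; NOT real asteroid data.
# Hypothetical family sizes, deliberately imbalanced: two large, three small.
family_sizes = {0: 400, 1: 300, 2: 20, 3: 15, 4: 12}
centers = rng.uniform(0.0, 1.0, size=(len(family_sizes), 3))
X = np.vstack([centers[f] + 0.02 * rng.standard_normal((n, 3))
               for f, n in family_sizes.items()])
y = np.concatenate([np.full(n, f) for f, n in family_sizes.items()])

# Assumed cutoff separating "small" from "large" families.
SMALL_MAX = 50
is_small = np.array([family_sizes[int(f)] <= SMALL_MAX for f in y], dtype=int)

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, y, is_small, test_size=0.3, stratify=y, random_state=0)

# Step 1: binary small-vs-large classifier (gradient boosting as an XGB stand-in).
step1 = GradientBoostingClassifier(random_state=0).fit(X_tr, s_tr)


def with_flag(features, flag):
    """Append the binary small/large flag as an extra feature column."""
    return np.column_stack([features, flag])


# Step 2: Random Forest over the original features plus the binary flag.
step2 = RandomForestClassifier(n_estimators=200, random_state=0)
step2.fit(with_flag(X_tr, s_tr), y_tr)

# At test time the flag comes from step 1's prediction, not from the labels.
y_pred = step2.predict(with_flag(X_te, step1.predict(X_te)))

# Per-family F1, the metric the abstract reports.
per_family_f1 = f1_score(y_te, y_pred, labels=sorted(family_sizes), average=None)
for family, score in zip(sorted(family_sizes), per_family_f1):
    print(f"family {family} (n={family_sizes[family]}): F1 = {score:.2f}")
```

Reporting F1 per family rather than overall accuracy matters here because a classifier that ignores the small families can still score well on aggregate metrics dominated by the large ones.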

Source journal

Experimental Astronomy (Earth Science and Astronomy: Astronomy and Astrophysics)

CiteScore

5.30

Self-citation rate

3.30%

Number of articles published

Review time

6-12 weeks

Journal description: Many new instruments for observing astronomical objects at a variety of wavelengths have been and are continually being developed. Furthermore, a vast amount of effort is being put into the development of new techniques for data analysis in order to cope with the great streams of data collected by these instruments. Experimental Astronomy acts as a medium for the publication of papers of contemporary scientific interest on astrophysical instrumentation and methods necessary for the conduct of astronomy at all wavelength fields. Experimental Astronomy publishes full-length articles, research letters and reviews on developments in detection techniques, instruments, and data analysis and image processing techniques. Occasional special issues are published, giving an in-depth presentation of the instrumentation and/or analysis connected with specific projects, such as satellite experiments or ground-based telescopes, or of specialized techniques.