Asteroid family classification with machine learning: Investigative analysis of a novel two-step approach for categorizing known small asteroid families
Fatin Abrar Shams, Abdullah Al Mahmud Nafiz, Md. Salman Mohosheu, Maheen Mashrur Hoque, Samiur Rashid Abir, Rashed Hasan Ratul, Md. Mushfiqur Rahman Mushfique, Aftab Ibn Nazim, Rubaiat Rehman Khan, Md Mahmudunnobe, Mohsinul Kabir
Experimental Astronomy 59 (1). Published 2025-01-31. DOI: 10.1007/s10686-025-09982-y
https://link.springer.com/article/10.1007/s10686-025-09982-y
Abstract
The term “asteroid family” refers to a collection of asteroids that share similar proper orbital elements, such as semi-major axis, eccentricity, and orbital inclination. Detecting small asteroid families has long been a challenge because of their extremely small sample sizes. In general, standalone machine learning classifiers tend to be biased towards classes with larger sample sizes, which results in inadequate classification of asteroid families with limited data. In this paper, a two-step supervised model is proposed for the effective classification of asteroid families, especially the tiny, small, and lower groups of medium asteroid families. The two-step design is an attempt to resolve the challenges that come with the imbalanced dataset. In the first step, an XGB (Extreme Gradient Boosting) classifier performs a binary classification of families into small and large. In the second step, a Random Forest classifier is used together with the previously identified binary features to classify the asteroid families. The proposed model achieved higher F1 scores for tiny and small asteroid families than the other algorithms tested in this work. It also achieved a perfect F1 score for 90 of the 112 families tested. For the lower group of medium-sized asteroid families, it performed slightly worse than previously used machine learning algorithms. In addition, four techniques for handling dataset imbalance were employed in this work and compared with the proposed algorithm.
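The two-step idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic stand-ins for proper elements, scikit-learn's GradientBoostingClassifier is swapped in for the XGB classifier so the example needs only NumPy and scikit-learn, and the family sizes, the SMALL_THRESHOLD cut-off, and the way the step-1 output is appended as a feature are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for proper elements (a, e, sin i). NOT real asteroid data.
# Hypothetical family sizes: two large families and four small ones.
family_sizes = {0: 600, 1: 400, 2: 20, 3: 16, 4: 12, 5: 12}
SMALL_THRESHOLD = 100  # assumed cut-off between "small" and "large" families

X_parts, y_parts = [], []
for fam, n in family_sizes.items():
    # Each family is a tight cluster around a random centre in element space.
    centre = rng.uniform([2.2, 0.05, 0.02], [3.2, 0.25, 0.30])
    X_parts.append(centre + rng.normal(scale=[0.01, 0.005, 0.005], size=(n, 3)))
    y_parts.append(np.full(n, fam))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: binary classifier, small family (1) versus large family (0).
# GradientBoostingClassifier is a stand-in for the XGB classifier in the text.
small_ids = [fam for fam, n in family_sizes.items() if n < SMALL_THRESHOLD]
step1 = GradientBoostingClassifier(random_state=0)
step1.fit(X_train, np.isin(y_train, small_ids).astype(int))


def add_flag(features):
    """Append the step-1 small/large prediction as an extra feature column."""
    return np.column_stack([features, step1.predict(features)])


# Step 2: Random Forest over the original features plus the binary flag.
step2 = RandomForestClassifier(n_estimators=200, random_state=0)
step2.fit(add_flag(X_train), y_train)
y_pred = step2.predict(add_flag(X_test))

# Per-family F1, the metric the abstract reports.
per_family_f1 = f1_score(y_test, y_pred, average=None, labels=sorted(family_sizes))
for fam, score in zip(sorted(family_sizes), per_family_f1):
    print(f"family {fam} (n={family_sizes[fam]}): F1 = {score:.2f}")
```

Reporting F1 per family rather than a single overall accuracy matters here, because with imbalanced classes a high overall accuracy can hide poor performance on the smallest families.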
About the journal:
Many new instruments for observing astronomical objects at a variety of wavelengths have been and are continually being developed. Furthermore, a vast amount of effort is being put into the development of new techniques for data analysis in order to cope with great streams of data collected by these instruments.
Experimental Astronomy acts as a medium for the publication of papers of contemporary scientific interest on astrophysical instrumentation and methods necessary for the conduct of astronomy at all wavelength fields.
Experimental Astronomy publishes full-length articles, research letters and reviews on developments in detection techniques, instruments, and data analysis and image processing techniques. Occasional special issues are published, giving an in-depth presentation of the instrumentation and/or analysis connected with specific projects, such as satellite experiments or ground-based telescopes, or of specialized techniques.