Yi Zi You, YunRong Pan, Zhi Ma, Li Zhang, Shuo Xiao, Dan Dan Zhang, Shijun Dang, Shuang Ru Zhao, Pei Wang, Ai-Jun Dong, Jiatao Jiang, Jibing Leng, Weian Li, Siyao Li
Research in Astronomy and Astrophysics. Published 2023-11-13. DOI: 10.1088/1674-4527/ad0c28 (https://doi.org/10.1088/1674-4527/ad0c28)
Citations: 0
Applying hybrid clustering in pulsar candidate sifting with multi-modality for FAST survey
Abstract
Pulsar search is the basis of pulsar navigation, gravitational wave detection, and other research topics. The volume of pulsar candidates collected by the Five-hundred-meter Aperture Spherical radio Telescope (FAST) is growing explosively, which challenges its pulsar candidate filtering system. In particular, the multi-view heterogeneous data and the class imbalance between true pulsars and non-pulsar candidates degrade traditional single-modal supervised classification methods. This study presents a multi-modal, semi-supervised pulsar candidate sifting algorithm that adopts a hybrid ensemble clustering scheme of density-based and partition-based methods, combined with a feature-level fusion strategy for the input data and a data partition strategy for parallelization. Experiments on both HTRU2 (the High Time Resolution Universe Survey) and actual FAST observation data show that the proposed algorithm identifies pulsars well: on HTRU2, the precision and recall of its parallel mode reach 0.981 and 0.988; on FAST data, they reach 0.891 and 0.961. The running time also decreases significantly as the number of parallel nodes increases, within limits. We conclude that the algorithm is a feasible approach for large-scale pulsar candidate sifting in FAST drift-scan observations.
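The abstract names three supporting pieces around the clustering step: feature-level fusion of multi-view inputs, data partitioning for parallel nodes, and evaluation by precision and recall. The sketch below illustrates those three pieces only; it is not the authors' implementation and does not include the hybrid clustering itself. The function names, the min-max scaling before concatenation, and the round-robin partitioning are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one feature column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def fuse_views(views: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Feature-level fusion (assumed form): scale every column of every view,
    then concatenate the views into one feature vector per candidate."""
    n = len(views[0])
    fused: List[List[float]] = [[] for _ in range(n)]
    for view in views:
        if len(view) != n:
            raise ValueError("all views must describe the same candidates")
        # zip(*view) turns rows into columns so each feature is scaled alone.
        columns = [min_max_scale(col) for col in zip(*view)]
        for i, row in enumerate(zip(*columns)):
            fused[i].extend(row)
    return fused


def split_into_blocks(items: Sequence, n_blocks: int) -> List[list]:
    """Data partition (assumed form): deal items round-robin into n_blocks
    lists, one list per hypothetical parallel node."""
    return [list(items[i::n_blocks]) for i in range(n_blocks)]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive (pulsar = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data: two hypothetical views of the same three candidates.
profile_stats = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]]
dm_curve_stats = [[0.0], [4.0], [8.0]]
fused = fuse_views([profile_stats, dm_curve_stats])
blocks = split_into_blocks(list(range(5)), 2)
scores = precision_recall([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])
```

Because pulsars are rare among candidates, precision and recall on the pulsar class are more informative than plain accuracy, which is presumably why the abstract reports those two figures.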
About the journal:
Research in Astronomy and Astrophysics (RAA) is an international journal publishing original research papers and reviews across all branches of astronomy and astrophysics, with a particular interest in the following topics:
- large-scale structure of universe formation and evolution of galaxies
- high-energy and cataclysmic processes in astrophysics
- formation and evolution of stars
- astrogeodynamics
- solar magnetic activity and heliogeospace environments
- dynamics of celestial bodies in the solar system and artificial bodies
- space observation and exploration
- new astronomical techniques and methods