Towards Automating Search and Classification of Protostellar Images
Pavan Kumar Bondalapati, Pengwei Hu, Shannon E Paylor, John Zhang
2021 Systems and Information Engineering Design Symposium (SIEDS), published 2021-04-30. DOI: 10.1109/SIEDS52267.2021.9483748
Abstract
Research on the origins of planets and life centers on protoplanetary disks and protostars, and the Atacama Large Millimeter/submillimeter Array (ALMA) has been revolutionary for this work because it captures high-resolution images with exceptional sensitivity. Astronomers study these birthplaces of planets because their properties determine the properties of any eventual planets. The ALMA science archive contains over a petabyte of astronomical data collected by the telescope over the last decade. Although the archive is publicly available, manually searching through many thousands of unlabelled images and ascertaining the type and physical properties of the celestial objects in them is immensely labor-intensive. As a result, a manual search of the archive is unlikely to be comprehensive, and astronomers may miss objects that were not the primary target of an observing program. We develop a Python package that automates noise filtering, identifies astronomical objects within a single image, and fits a bivariate Gaussian to each detection. We apply an unsupervised learning algorithm to identify groups of apparently different protostellar disk images in a curated ALMA data set. Using this model and the residuals from the bivariate Gaussian fit, we can flag unusual images (e.g. spirals, rings, or other structure that does not follow a bivariate Gaussian shape) for manual review, so that astronomers can examine a small subset of interesting images without sifting through the entire archive. Our open-source package is intended to help astronomers make new scientific discoveries by removing a labor-intensive bottleneck in their research.
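The residual-based flagging described above can be illustrated with a minimal sketch. It is not the authors' package: the function names, the method-of-moments fit (standing in for whatever fitting routine the package actually uses), the synthetic images, and the 0.2 threshold are all assumptions chosen for illustration. It also assumes the image has already been noise-filtered and is non-negative. Plain Python lists keep the example dependency-free; a real pipeline would more likely use NumPy arrays.

```python
import math


def gaussian2d(x, y, amp, x0, y0, sxx, syy, sxy):
    """Bivariate Gaussian with covariance [[sxx, sxy], [sxy, syy]]."""
    det = sxx * syy - sxy * sxy
    dx, dy = x - x0, y - y0
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return amp * math.exp(-0.5 * q)


def fit_by_moments(image):
    """Estimate Gaussian parameters from intensity-weighted image moments.

    Assumes a single, non-negative, background-subtracted source.
    """
    total = sum(sum(row) for row in image)
    x0 = sum(v * x for row in image for x, v in enumerate(row)) / total
    y0 = sum(v * y for y, row in enumerate(image) for v in row) / total
    sxx = syy = sxy = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            sxx += v * (x - x0) ** 2
            syy += v * (y - y0) ** 2
            sxy += v * (x - x0) * (y - y0)
    sxx, syy, sxy = sxx / total, syy / total, sxy / total
    # For a Gaussian, total flux = amp * 2*pi*sqrt(det(covariance)).
    amp = total / (2.0 * math.pi * math.sqrt(sxx * syy - sxy * sxy))
    return amp, x0, y0, sxx, syy, sxy


def residual_fraction(image):
    """RMS of (image - fitted Gaussian), relative to the RMS of the image."""
    params = fit_by_moments(image)
    resid = signal = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            resid += (v - gaussian2d(x, y, *params)) ** 2
            signal += v * v
    return math.sqrt(resid / signal)


def make_image(func, size=41):
    """Sample func(x, y) on a size-by-size pixel grid (synthetic test data)."""
    return [[func(x, y) for x in range(size)] for y in range(size)]


def ring(x, y, radius=8.0, width=1.5, cx=20.0, cy=20.0):
    """Synthetic ring: a Gaussian profile in radius around a hollow center."""
    r = math.hypot(x - cx, y - cy)
    return math.exp(-0.5 * ((r - radius) / width) ** 2)


# Hypothetical cut-off; a real threshold would need tuning on labelled data.
THRESHOLD = 0.2

disk_image = make_image(lambda x, y: gaussian2d(x, y, 1.0, 20, 20, 16.0, 9.0, 4.0))
ring_image = make_image(ring)

for name, img in [("disk", disk_image), ("ring", ring_image)]:
    score = residual_fraction(img)
    print(name, round(score, 3), "flag for review" if score > THRESHOLD else "ok")
```

A smooth, Gaussian-like disk leaves almost no residual, while the ring leaves a large one because the fitted Gaussian peaks where the ring is empty. That contrast is what makes the residual a usable flag for manual review.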