Selection of Dwarf Galaxies Hosting Active Galactic Nuclei: A Measure of Bias and Contamination Using Unsupervised Machine Learning Techniques
Sogol Sanjaripour, Archana Aravindan, Gabriela Canalizo, Shoubaneh Hemmati, Bahram Mobasher, Alison L. Coil and Barry C. Barish
The Astrophysical Journal, published 2025-10-08. DOI: 10.3847/1538-4357/ae0326 (https://doi.org/10.3847/1538-4357/ae0326)
Abstract
Identifying active galactic nuclei (AGNs) in dwarf galaxies is critical for understanding black hole formation but remains observationally challenging due to their low luminosities, metallicities, and star formation–driven emission that can obscure AGN signatures. Machine learning techniques, particularly unsupervised methods, offer new ways to address these challenges by uncovering patterns in complex, multiwavelength data. In this study, we apply Self-Organizing Maps (SOMs) to explore the spectral energy distribution (SED) manifold of dwarf galaxies and evaluate AGN selection biases across various diagnostics. We train a 51 × 51 SOM on 30,344 dwarf galaxies (z < 0.055, M* < 10^9.5 M⊙) from the NSA catalog using nine-band photometry spanning near-UV to mid-infrared. A set of 438 previously identified dwarf AGNs, selected via mid-infrared color, optical emission lines, X-ray, optical variability, and broad-line features, was mapped onto the SOM. AGNs identified by different methods occupy distinct and partially overlapping regions in SED space, reflecting biases related to host galaxy properties. Wide-field Infrared Survey Explorer (WISE)-selected AGNs are strongly concentrated in lower-mass regions and form two distinct clumps: one associated with bluer, starburst-like systems and the other with redder galaxies showing spectral features more typical of AGNs. This separation may help disentangle true AGN hosts from starburst contaminants in WISE-selected samples. Additionally, AGNs selected via various diagnostics tend to avoid regions of strong star formation, while a subset of lower-mass AGNs occupy SOM regions indicative of high AGN luminosity relative to their stellar content. Our results demonstrate the utility of manifold learning in refining AGN selection in the low-mass regime.
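
For readers unfamiliar with the technique, the workflow described in the abstract (train a SOM on photometric features, then map a labeled subsample onto it) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names `train_som` and `map_to_som`, the 8 × 8 grid, the linear decay schedules, and the random mock data are all assumptions chosen for brevity, and the real analysis uses a 51 × 51 map trained on catalog photometry.

```python
import numpy as np

def train_som(data, grid=(8, 8), n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM with online updates (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # One weight vector per cell, same dimensionality as the input features.
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    # Grid coordinates of every cell, used by the neighborhood function.
    yy, xx = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # assumed linear learning-rate decay
        sigma = sigma0 * (1.0 - frac) + 0.5     # assumed shrinking neighborhood width
        x = data[rng.integers(len(data))]       # draw one training sample
        # Best-matching unit (BMU): the cell whose weights are closest to x.
        d2 = ((weights - x) ** 2).sum(axis=2)
        bi, bj = np.unravel_index(np.argmin(d2), d2.shape)
        # Gaussian neighborhood centered on the BMU in grid space.
        h = np.exp(-((yy - bi) ** 2 + (xx - bj) ** 2) / (2.0 * sigma ** 2))
        # Pull the BMU and its neighbors toward the sample.
        weights += lr * h[..., None] * (x - weights)
    return weights

def map_to_som(weights, samples):
    """Return the (row, col) BMU for each sample (hypothetical helper)."""
    flat = weights.reshape(-1, weights.shape[-1])
    d2 = ((samples[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return np.column_stack(np.unravel_index(idx, weights.shape[:2]))

# Mock data only: random stand-ins for standardized nine-band features.
rng = np.random.default_rng(1)
galaxies = rng.normal(size=(500, 9))        # stand-in for the parent dwarf sample
agn = rng.normal(loc=1.0, size=(20, 9))     # stand-in for a labeled AGN subsample

som = train_som(galaxies)
cells = map_to_som(som, agn)

# Count how many labeled objects land in each SOM cell.
occupancy = np.zeros(som.shape[:2], dtype=int)
np.add.at(occupancy, (cells[:, 0], cells[:, 1]), 1)
```

Comparing occupancy maps built this way for samples selected by different diagnostics is one way to visualize where each selection method concentrates on the trained map.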