Point-Source Target Detection and Localization in Single-Frame Infrared Imagery
Daniel C. Stumpp, Andrew J. Byrne, Alan D. George
2023 IEEE Aerospace Conference, 2023-03-04. DOI: 10.1109/AERO55745.2023.10115662 (https://doi.org/10.1109/AERO55745.2023.10115662)
Potential military threats often manifest as dim point-source targets embedded in complex clutter and noise backgrounds, which makes threat detection a significant challenge. A variety of machine-learning architectures have been developed in recent years for performing small-object segmentation in single frames of infrared imagery. Evaluation and comparison of these techniques have been hampered by a lack of reliably labeled data and the use of different evaluation metrics. In this research, we leverage the Air Force Institute of Technology Sensor and Scene Emulation Tool (ASSET) to generate a dataset containing independent frames with unique background and target characteristics. We introduce a standardized method for generating ground-truth segmentation masks for point-source targets that eliminates the risk of the manual labeling errors present in other small-target segmentation datasets. A local peak signal-to-clutter-and-noise ratio (pSCNR) is also introduced and shown to be strongly correlated with probability of detection. Results show that, using the generated dataset, existing state-of-the-art small-object segmentation networks can be adapted specifically to the point-source target detection task. A probability of detection (Pd) greater than 80% is consistently achieved while maintaining low false alarm rates. In addition to target detection, we address the problem of target subpixel localization in a single frame. Accurate subpixel localization is important because a single pixel covers a large physical area. Existing work commonly overlooks this problem or takes the predicted target mask centroid as the subpixel location. We introduce a transformer-based subpixel localization technique that uses both the predicted target mask and the local pixel intensity to compute an accurate subpixel location. 
The proposed architecture reduces mean localization error by up to 72% compared to other single-frame methods for target subpixel localization.
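
To make the localization baseline concrete, the following is a minimal sketch of the mask-centroid approach that the abstract attributes to existing work, alongside a simple intensity-weighted centroid that combines the mask with local pixel intensity. It is not the transformer-based technique proposed in the paper. The function names, the synthetic Gaussian target, the 3x3 mask, and the median background estimate are all assumptions made here for illustration.

```python
import numpy as np


def mask_centroid(mask):
    """Unweighted centroid (row, col) of the nonzero mask pixels."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask is empty")
    return float(rows.mean()), float(cols.mean())


def intensity_weighted_centroid(image, mask):
    """Centroid of the mask pixels weighted by background-subtracted intensity."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask is empty")
    # Assumption: the median of the pixels outside the mask approximates
    # the local background level.
    background = np.median(image[~mask])
    weights = np.clip(image[rows, cols] - background, 0.0, None)
    total = weights.sum()
    if total == 0:
        # No signal above background: fall back to the plain mask centroid.
        return mask_centroid(mask)
    return (float(np.sum(rows * weights) / total),
            float(np.sum(cols * weights) / total))


# Hypothetical noise-free 9x9 patch: a Gaussian blob on a flat background,
# with the true target position offset from the pixel grid.
true_row, true_col = 4.3, 3.7
rr, cc = np.mgrid[0:9, 0:9]
patch = 10.0 + 50.0 * np.exp(
    -((rr - true_row) ** 2 + (cc - true_col) ** 2) / (2 * 0.8 ** 2)
)

# Hypothetical predicted mask: a 3x3 window around the brightest pixel.
mask = np.zeros((9, 9), dtype=bool)
mask[3:6, 3:6] = True

print("mask centroid:    ", mask_centroid(mask))
print("weighted centroid:", intensity_weighted_centroid(patch, mask))
```

In this synthetic case the mask centroid snaps to the center of the mask, while the weighted estimate moves toward the true subpixel position. Real imagery with clutter and noise would behave less cleanly.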