Emily E. Berkson, Jared D. VanCor, Steven Esposito, Gary Chern, M. D. Pritt
{"title":"Synthetic Data Generation to Mitigate the Low/No-Shot Problem in Machine Learning","authors":"Emily E. Berkson, Jared D. VanCor, Steven Esposito, Gary Chern, M. D. Pritt","doi":"10.1109/AIPR47015.2019.9174596","DOIUrl":null,"url":null,"abstract":"The low/no-shot problem refers to a lack of available data for training deep learning algorithms. In remote sensing, complete image data sets are rare and do not always include the targets of interest. We propose a method to rapidly generate highfidelity synthetic satellite imagery featuring targets of interest over a range of solar illuminations and platform geometries. Specifically, we used the Digital Imaging and Remote Sensing Image Generation model and a custom image simulator to produce synthetic imagery of C130 aircraft in place of real Worldview-3 imagery. Our synthetic imagery was supplemented with real Worldview-3 images to test the efficacy of training deep learning algorithms with synthetic data. We deliberately chose a challenging test case of distinguishing C130s from other aircraft, or neither. Results show a negligible improvement in automatic target classification when synthetic data is supplemented with a small amount of real imagery. However, training with synthetic data alone only achieves F1-scores in line with a random classifier, suggesting that there is still significant domain mismatch between the real and synthetic datasets.","PeriodicalId":167075,"journal":{"name":"2019 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIPR47015.2019.9174596","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
The low/no-shot problem refers to a lack of available data for training deep learning algorithms. In remote sensing, complete image data sets are rare and do not always include the targets of interest. We propose a method to rapidly generate highfidelity synthetic satellite imagery featuring targets of interest over a range of solar illuminations and platform geometries. Specifically, we used the Digital Imaging and Remote Sensing Image Generation model and a custom image simulator to produce synthetic imagery of C130 aircraft in place of real Worldview-3 imagery. Our synthetic imagery was supplemented with real Worldview-3 images to test the efficacy of training deep learning algorithms with synthetic data. We deliberately chose a challenging test case of distinguishing C130s from other aircraft, or neither. Results show a negligible improvement in automatic target classification when synthetic data is supplemented with a small amount of real imagery. However, training with synthetic data alone only achieves F1-scores in line with a random classifier, suggesting that there is still significant domain mismatch between the real and synthetic datasets.