Authors: Luciano Araújo Dourado Filho, R. Calumby
Journal: Revista Brasileira de Computacao Aplicada
Published: 2022-06-13
DOI: 10.5335/rbca.v14i2.13487
Data Augmentation policies and heuristics effects over dataset imbalance for developing plant identification systems based on Deep Learning: A case study.
Data augmentation (DA) is a widely used strategy for improving the effectiveness of computer vision models such as Deep Convolutional Neural Networks (DCNNs). DA is known to improve model generalization by increasing data diversity; in this work we instead investigate its effects on two different sources of dataset imbalance (i.e., Content and Sampling imbalance) in a plant species recognition task. We systematically evaluated several techniques for generating the augmented datasets used to train the DCNN models, which enabled a thorough investigation of the effects of DA on imbalance attenuation. The results suggest that data augmentation mitigates the negative effects of underrepresentation, which is mainly caused by dataset imbalance.
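As an illustration of how augmentation can attenuate imbalance, the following minimal sketch tops up underrepresented classes with augmented copies until every class matches the largest one. It is not the authors' pipeline: it assumes imbalance is measured as per-class sample count, represents images as plain nested lists, and uses two simple flips as a hypothetical augmentation policy. The names `balance_by_augmentation`, `class_counts`, and `AUGMENTATIONS` are invented for this example.

```python
import random
from collections import Counter, defaultdict


def hflip(img):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror an image (a list of pixel rows) top to bottom."""
    return [list(row) for row in img[::-1]]


# Hypothetical policy chosen for illustration only; the paper evaluates
# several techniques that are not reproduced here.
AUGMENTATIONS = [hflip, vflip]


def class_counts(samples):
    """Count how many (image, label) pairs each label has."""
    return Counter(label for _, label in samples)


def balance_by_augmentation(samples, seed=0):
    """Return the samples plus augmented copies of minority-class images.

    Every class is topped up to the size of the largest class. Assumption:
    imbalance is measured purely by the number of samples per class.
    """
    if not samples:
        return []
    rng = random.Random(seed)

    # Group the original images by label.
    by_label = defaultdict(list)
    for img, label in samples:
        by_label[label].append(img)

    target = max(len(imgs) for imgs in by_label.values())
    balanced = list(samples)
    for label, imgs in by_label.items():
        # Add one augmented copy per missing sample.
        for _ in range(target - len(imgs)):
            source = rng.choice(imgs)
            transform = rng.choice(AUGMENTATIONS)
            balanced.append((transform(source), label))
    return balanced


if __name__ == "__main__":
    common = [([[1, 2], [3, 4]], "common_species")] * 5
    rare = [([[5, 6], [7, 8]], "rare_species")]
    augmented = balance_by_augmentation(common + rare)
    print(class_counts(common + rare))
    print(class_counts(augmented))
```

Equalizing class counts this way only adds transformed copies of existing images, so it cannot introduce content that the underrepresented class never contained. A real study would also need to evaluate the trained model per class to see whether the imbalance effects are actually reduced.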