Carlos Medel-Vera, Pelayo Vidal-Estévez, Thomas Mädler
{"title":"A convolutional neural network approach to classifying urban spaces using generative tools for data augmentation","authors":"Carlos Medel-Vera, Pelayo Vidal-Estévez, Thomas Mädler","doi":"10.1177/14780771231225697","DOIUrl":null,"url":null,"abstract":"This article discusses an application for classifying urban spaces using convolutional neural networks (CNNs). A seed dataset was initially generated composed of 630 photographs of urban spaces from the Adobe Stock repository. This dataset was topped up with images produced by two generative artificial intelligence (AI) engines, namely, Deep Dream Generator and Midjourney, making two additional augmented datasets, each composed of 2200 images. The training process was carried out using four well-known CNNs, namely, GoogLeNet, ResNet-18, ShuffleNet, and MobileNet-v2. The results show an increase of roughly 30% in the predicting capabilities in both augmented datasets when compared to the seed dataset. Furthermore, performance metrics are generally higher when using ResNet-18 which may suggest that this CNN architecture is more applicable to urban classification projects. Finally, although both generative AI engines have similar performance, Midjourney seems to slightly outperform Deep Dream Generator as a data augmentation engine for urban spaces.","PeriodicalId":45139,"journal":{"name":"International Journal of Architectural Computing","volume":"40 19","pages":""},"PeriodicalIF":1.6000,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Architectural Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1177/14780771231225697","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"0","JCRName":"ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
This article discusses an application for classifying urban spaces using convolutional neural networks (CNNs). A seed dataset was initially generated composed of 630 photographs of urban spaces from the Adobe Stock repository. This dataset was topped up with images produced by two generative artificial intelligence (AI) engines, namely, Deep Dream Generator and Midjourney, making two additional augmented datasets, each composed of 2200 images. The training process was carried out using four well-known CNNs, namely, GoogLeNet, ResNet-18, ShuffleNet, and MobileNet-v2. The results show an increase of roughly 30% in the predicting capabilities in both augmented datasets when compared to the seed dataset. Furthermore, performance metrics are generally higher when using ResNet-18 which may suggest that this CNN architecture is more applicable to urban classification projects. Finally, although both generative AI engines have similar performance, Midjourney seems to slightly outperform Deep Dream Generator as a data augmentation engine for urban spaces.