Generating floor plans with deep learning: A cross-validation assessment over different dataset sizes

Impact factor: 1.6, Category: Architecture

International Journal of Architectural Computing, Pub Date: 2022-09-01, DOI: 10.1177/14780771221120842

Ricardo C Rodrigues, R. Duarte

Volume 20, pp. 630-644

Citations: 1

Abstract

The advent of deep learning has opened a series of opportunities; one of them is the ability to tackle subjective factors in floor plan design and make predictions through spatial semantic maps. At the same time, the amount of available data grows exponentially on a daily basis. In this context, this research investigates deep generative methods for floor plan design and the relationship between data volume and training time, output quality, and output diversity; in other words, how much data is required to rapidly train models that return optimal results. We used Pix2pix, a variation of the Conditional Generative Adversarial Network algorithm, and a dataset of approximately 80 thousand images to train 10 models and evaluate their performance through a series of computational metrics. The results show that the potential of this data-driven method depends not only on the diversity of the training set but also on the linearity of the distribution; accordingly, high-dimensional datasets did not achieve good results. We also conclude that models trained on small datasets (800 images) may return excellent results if given the correct training instructions (hyperparameters), but the best baseline for this generative task lies in the mid-range, using around 20 to 30 thousand images with a linear distribution. Finally, standard guidelines for dataset design are presented, along with the impact of data curation along the entire process.
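The experimental setup described in the abstract (several models trained on differently sized portions of one image pool) could be prepared along the following lines. This is only a minimal sketch under stated assumptions, not the authors' procedure: the function name `nested_subsets`, the file naming scheme, the nesting of smaller subsets inside larger ones, and most of the subset sizes are hypothetical. Only the figures 800, 20 to 30 thousand, roughly 80 thousand, and the count of 10 models appear in the abstract.

```python
import random


def nested_subsets(image_ids, sizes, seed=0):
    """Return {size: ids} where every smaller subset is contained in each larger one.

    Nesting is an assumption made here so that results across sizes stay
    comparable; the abstract does not say how the subsets were drawn.
    """
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    subsets = {}
    for size in sorted(sizes):
        if size > len(shuffled):
            raise ValueError(f"requested {size} images, pool has {len(shuffled)}")
        subsets[size] = shuffled[:size]  # prefix of one shuffle => nested subsets
    return subsets


# Hypothetical pool of about 80 thousand floor plan images (file names are made up).
pool = [f"plan_{i:05d}.png" for i in range(80_000)]

# Ten illustrative sizes, one per model. Only 800, 20_000, 30_000 and 80_000
# are taken from the abstract; the rest are placeholders.
sizes = [800, 1_600, 3_200, 6_400, 12_800, 20_000, 30_000, 40_000, 60_000, 80_000]

subsets = nested_subsets(pool, sizes)
for size, ids in subsets.items():
    print(f"{size:>6} images, first file: {ids[0]}")
```

Each resulting list could then be handed to whatever Pix2pix training script is in use, so that training time and output metrics can be compared across dataset sizes.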


Source journal

International Journal of Architectural Computing (ARCHITECTURE)

CiteScore

3.20

Self-citation rate

17.60%

Number of publications