Generating floor plans with deep learning: A cross-validation assessment over different dataset sizes

Ricardo C. Rodrigues, R. Duarte
International Journal of Architectural Computing, Vol. 20, No. 1, pp. 630-644 (published 2022-09-01)
DOI: 10.1177/14780771221120842 · Impact Factor: 1.6 · JCR field: Architecture
Citations: 1

Abstract

The advent of deep learning has opened up a series of opportunities, one of which is the ability to address subjective factors in floor plan design and make predictions through spatial semantic maps. However, the amount of available data grows exponentially every day; in this sense, this research investigates deep generative methods for floor plan design and the relationship between data volume and training time, output quality, and output diversity; in other words, how much data is required to rapidly train models that return optimal results. In our research, we used a variation of the Conditional Generative Adversarial Network algorithm, Pix2pix, and a dataset of approximately 80 thousand images to train 10 models and evaluate their performance through a series of computational metrics. The results show that the potential of this data-driven method depends not only on the diversity of the training set but also on the linearity of the distribution; consequently, high-dimensional datasets did not achieve good results. We also conclude that models trained on small sets of data (800 images) may return excellent results if given the correct training settings (hyperparameters), but the best baseline for this generative task lies in the mid-range, using around 20 to 30 thousand images with a linear distribution. Finally, standard guidelines for dataset design are presented, along with the impact of data curation throughout the entire process.
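The sketch below illustrates the kind of dataset-size sweep the abstract describes: training the same conditional GAN recipe on subsets of different sizes and comparing the resulting models. It is a minimal, assumed workflow, not the authors' code; the dataset path, the subset sizes other than those named in the abstract, and the train_pix2pix stub are hypothetical placeholders.

```python
# Minimal sketch of a dataset-size sweep for a Pix2pix-style experiment.
# Assumptions (not from the paper): DATASET_DIR, the intermediate subset sizes,
# and the stubbed train_pix2pix function standing in for a real cGAN training run.
import random
from pathlib import Path

DATASET_DIR = Path("data/floorplans")  # hypothetical folder of paired floor-plan images

# Sizes compared in the study range from small (800) through the mid-range
# (20-30 thousand) up to the full set of roughly 80 thousand images.
SUBSET_SIZES = [800, 10_000, 20_000, 30_000, 80_000]

def train_pix2pix(image_paths, epochs=200, lr=2e-4):
    """Stub standing in for a full Pix2pix (conditional GAN) training run.

    A real implementation would optimize a U-Net generator against a PatchGAN
    discriminator with an adversarial + L1 objective (Isola et al., 2017).
    """
    # Training loop omitted; return a dummy record so the sweep is runnable.
    return {"n_images": len(image_paths), "epochs": epochs, "lr": lr}

def run_sweep(all_paths):
    results = []
    for size in SUBSET_SIZES:
        subset = random.sample(all_paths, min(size, len(all_paths)))
        model = train_pix2pix(subset)
        # Quality/diversity metrics on a held-out set would be computed here
        # and stored alongside the model record for cross-size comparison.
        results.append((size, model))
    return results

if __name__ == "__main__":
    paths = sorted(DATASET_DIR.glob("*.png"))
    for size, model in run_sweep(paths):
        print(f"trained on {size} images -> {model}")
```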
Source journal: International Journal of Architectural Computing · CiteScore: 3.20 · Self-citation rate: 17.60% · Annual publications: 44