Clever little tricks: A socio-technical history of text-to-image generative models


International Journal of Architectural Computing Pub Date : 2023-06-01 DOI:10.1177/14780771231168230

K. Steinfeld

International Journal of Architectural Computing, vol. 21, pp. 211-241 (2023). https://doi.org/10.1177/14780771231168230


Abstract

The emergence of text-to-image generative models (e.g., Midjourney, DALL-E 2, Stable Diffusion) in the summer of 2022 impacted architectural visual culture suddenly, severely, and seemingly out of nowhere. To contextualize this phenomenon, this text offers a socio-technical history of text-to-image generative systems. Three moments in time, or “scenes,” are presented here: the first at the advent of AI in the middle of the last century; the second at the “reawakening” of a specific approach to machine learning at the turn of this century; the third that documents a rapid sequence of innovations, dubbed “clever little tricks,” that occurred across just 18 months. This final scene is the crux, and represents the first formal documentation of the recent history of a specific set of informal innovations. These innovations were produced by non-affiliated researchers and communities of creative contributors, and directly led to the technologies that so compellingly captured the architectural imagination in the summer of 2022. Across these scenes, we examine the technologies, application domains, infrastructures, social contexts, and practices that drive technical research and shape creative practice in this space.



