The Photographic Pipeline of Machine Vision; or, Machine Vision's Latent Photographic Theory

Nicolas Malevé, Katrina Sluis
Critical AI. Published 2023-10-01. DOI: 10.1215/2834703x-10734066 (https://doi.org/10.1215/2834703x-10734066)

Abstract

Despite computer vision's extensive mobilization of cameras, photographers, and viewing subjects, photography's place in machine vision remains undertheorized. This article illuminates an operative theory of photography that exists in a latent form, embedded in the tools, practices, and discourses of machine vision research and enabling the methodological imperatives of dataset production. Focusing on the development of the canonical object recognition dataset ImageNet, the article analyzes how the dataset pipeline translates the radical polysemy of the photographic image into a stable and transparent form of data that can be portrayed as a proxy of human vision. Reflecting on the prominence of the photographic snapshot in machine vision discourse, the article traces the path that made this popular cultural practice amenable to the dataset. Following the evolution from nineteenth-century scientific photography to the acquisition of massive sets of online photos, the article shows how dataset creators inherit and transform a form of “instrumental realism,” a photographic enterprise that aims to establish a generalized look from contingent instances in the pursuit of statistical truth. The article concludes with a reflection on how the latent photographic theory of machine vision we have advanced relates to the large image models built for generative AI today.