Ghosts in the Data: The Contested Politics of Absence in Data Infrastructures

IF 2.7, Zone 2 (Sociology), Q2, Computer Science, Interdisciplinary Applications
Will Orr
Journal: Social Science Computer Review
Publication type: Journal Article
Published: 2025-08-02
DOI: 10.1177/08944393251365277 (https://doi.org/10.1177/08944393251365277)

Abstract

Absences are inescapable in data. Data collection always focuses on some elements while occluding others. Yet, how absences are considered and recorded within data infrastructures markedly transforms the inferences that can be made. Tracing a genealogy from early databases to contemporary AI datasets, this paper explores how data infrastructures have grappled with the inherent incompleteness of data. Specifically, I uncover a tension between a desire for certainty and acknowledging partiality at the foundation of data science that continues to pervade contemporary AI datasets. Drawing on archival studies and sociological perspectives, I argue that data science must embrace uncertainty by recognizing the “ghosts in the data”—the uncounted, the unrepresented, and the silenced—and how their absence shapes the outcomes of automated systems.
Source journal
Social Science Computer Review (Social Sciences; Computer Science, Interdisciplinary Applications)
CiteScore: 9.00
Self-citation rate: 4.90%
Articles published: 95
Review time: >12 weeks
About the journal:
Unique Scope. Social Science Computer Review is an interdisciplinary journal covering social science instructional and research applications of computing, as well as societal impacts of information technology. Topics included: artificial intelligence, business, computational social science theory, computer-assisted survey research, computer-based qualitative analysis, computer simulation, economic modeling, electronic modeling, electronic publishing, geographic information systems, instrumentation and research tools, public administration, social impacts of computing and telecommunications, software evaluation, world-wide web resources for social scientists.
Interdisciplinary Nature. Because the uses and impacts of computing are interdisciplinary, so is Social Science Computer Review. The journal is of direct relevance to scholars and scientists in a wide variety of disciplines. In its pages you'll find work in the following areas: sociology, anthropology, political science, economics, psychology, computer literacy, computer applications, and methodology.