Outlier bias: AI classification of curb ramps, outliers, and context

IF 5.9 · Zone 1 (Sociology) · JCR Q1 (Social Sciences, Interdisciplinary)

Big Data & Society · Pub Date: 2023-07-01 · DOI: 10.1177/20539517231203669

Shiloh Deitz



Abstract

Technologies in the smart city, such as autonomous vehicles and delivery robots, promise to increase the mobility and freedom of people with disabilities. These technologies have also failed to “see” or comprehend wheelchair riders, people walking with service animals, and people walking with bicycles—all outliers to machine learning models. Big data and algorithms have been amply critiqued for their biases—harmful and systematic errors—but the harms that arise from AI's inherent inability to handle nuance, context, and exception have been largely overlooked. In this paper, I run two machine learning models across nine cities in the United States to attempt to fill a gap in data about the location of curb ramps. I find that while curb ramp prediction models may achieve up to 88% accuracy, the rate of accuracy varied in context in ways both predictable and unpredictable. I look closely at cases of unpredictable error (outlier bias), by triangulating with aerial and street view imagery. The sampling of cases shows that while it may be possible to conjecture about patterns in these errors, there is nothing clearly systematic. While more data and bigger models might improve the accuracy somewhat, I propose that a bias toward outliers is something fundamental to machine learning models which gravitate to the mean and require unbiased and not missing data. I conclude by arguing that universal design or design for the outliers is imperative for justice in the smart city where algorithms and data are increasingly embedded as infrastructure.


Source journal

Big Data & Society (Social Sciences, Interdisciplinary)

CiteScore: 10.90

Self-citation rate: 10.60%

Review time: 11 weeks

Journal description: Big Data & Society (BD&S) is an open access, peer-reviewed scholarly journal that publishes interdisciplinary work principally in the social sciences, humanities, and computing and their intersections with the arts and natural sciences. The journal focuses on the implications of Big Data for societies and aims to connect debates about Big Data practices and their effects on various sectors such as academia, social life, industry, business, and government. BD&S considers Big Data as an emerging field of practices, not solely defined by but generative of unique data qualities such as high volume, granularity, data linking, and mining. The journal pays attention to digital content generated both online and offline, encompassing social media, search engines, closed networks (e.g., commercial or government transactions), and open networks like digital archives, open government, and crowdsourced data. Rather than providing a fixed definition of Big Data, BD&S encourages interdisciplinary inquiries, debates, and studies on various topics and themes related to Big Data practices. BD&S seeks contributions that analyze Big Data practices, involve empirical engagements and experiments with innovative methods, and reflect on the consequences of these practices for the representation, realization, and governance of societies. As a digital-only journal, BD&S's platform can accommodate multimedia formats such as complex images, dynamic visualizations, videos, and audio content. The contents of the journal encompass peer-reviewed research articles, colloquia, bookcasts, think pieces, state-of-the-art methods, and work by early career researchers.