Understanding and evaluating harms of AI-generated image captions in political images

IF 2.3 Q1 INTERNATIONAL RELATIONS
Habiba Sarhan, Simon Hegelich
Journal: Frontiers in Political Science
Published: 2023-09-20
DOI: 10.3389/fpos.2023.1245684 (https://doi.org/10.3389/fpos.2023.1245684)
Citations: 1

Abstract

The use of AI-generated image captions has been increasing. Scholars of disability studies have long studied accessibility and AI issues related to technology bias, focusing on image captions and tags. However, less attention has been paid to the individuals and social groups who are depicted in images and captioned using AI. Further research is needed to understand the underlying representational harms that could affect these social groups. This paper investigates the potential representational harms to social groups depicted in images. There is a high risk of harming certain social groups, either by describing them stereotypically or by erasing their identities from the caption, which could affect the understandings, beliefs, and attitudes that people hold about these specific groups. For the purpose of this article, 1,000 images with human-annotated captions were collected from the “politics” sections of news agencies. AI-generated captions were produced with the December 2021 public version of Microsoft's Azure Cloud Services. The pattern observed in the politically salient images and their captions highlights the model's tendency to generate more generic descriptions, which may harm misrepresented social groups. Consequently, a balance between those harms needs to be struck, which is intertwined with the trade-off between generating generic vs. specific descriptions. The decision to generate generic descriptions, out of extra caution not to use stereotypes, erases and demeans excluded and already underrepresented social groups, while the decision to generate specific descriptions stereotypes social groups and reifies them. Choosing the appropriate trade-off is, therefore, crucial, especially when examining politically salient images.
Source journal: Frontiers in Political Science (Social Sciences: Political Science and International Relations)
CiteScore: 2.90
Self-citation rate: 0.00%
Articles published: 135
Review time: 13 weeks