Gender Disparities in Artificial Intelligence–Generated Images of Hospital Leadership in the United States

Mia Gisselbaek MD; Joana Berger-Estilita MD, PhD; Laurens Minsart MD; Ekin Köselerli MD; Arnout Devos PhD; Francisco Maio Matos PhD; Odmara L. Barreto Chang MD, PhD; Peter Dieckmann PhD; Melanie Suppan MD; Sarah Saxena MD, PhD
Journal: Mayo Clinic Proceedings. Digital health, volume 3, issue 2, Article 100218
Published: 2025-04-08
DOI: 10.1016/j.mcpdig.2025.100218
URL: https://www.sciencedirect.com/science/article/pii/S2949761225000252

Abstract


Objective

To evaluate demographic representation in artificial intelligence (AI)–generated images of hospital leadership roles and compare them with real-world data from US hospitals.

Patients and Methods

This cross-sectional study, conducted from October 1, 2024 to October 31, 2024, analyzed images generated by 3 AI text-to-image models: Midjourney 6.0, OpenAI ChatGPT DALL-E 3, and Google Gemini Imagen 3. Standardized prompts were used to create 1200 images representing 4 key leadership roles: chief executive officers, chief medical officers, chief nursing officers, and chief financial officers. Real-world demographic data from 4397 US hospitals showed that chief executive officers were 73.2% men; chief financial officers, 65.2% men; chief medical officers, 85.7% men; and chief nursing officers, 9.4% men (overall: 60.1% men). The primary outcome was gender representation, with secondary outcomes including race/ethnicity and age. Two independent reviewers assessed images, with interrater reliability evaluated using Cohen κ.
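
The interrater reliability measure named above, Cohen κ, can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `cohen_kappa` and the two short label lists are hypothetical and chosen only to show the arithmetic, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies, summed over labels.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical gender labels for eight images (illustrative only, not study data).
reviewer_a = ["man", "man", "woman", "man", "woman", "woman", "man", "man"]
reviewer_b = ["man", "man", "woman", "man", "woman", "man", "man", "man"]

kappa = cohen_kappa(reviewer_a, reviewer_b)
print(f"kappa = {kappa:.3f}")
```

In this toy case the reviewers agree on 7 of 8 images, yet κ is noticeably lower than the raw agreement because much of that agreement would be expected by chance given how often both reviewers chose "man".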

Results

Interrater agreement was high for gender (κ=0.998) and moderate for race/ethnicity (κ=0.670) and age (κ=0.605). DALL-E overrepresented men (86.5%) and White individuals (94.5%). Midjourney showed improved gender balance (69.5% men) but overrepresented White individuals (75.0%). Imagen achieved near gender parity (50.3% men) but remained predominantly White (51.5%). Statistically significant differences were observed across models and between models and real-world demographics.
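
To make the size of the reported differences concrete, the sketch below computes each model's share of men minus the overall real-world share (60.1%), in percentage points. It uses only the percentages quoted in the abstract. It is not the authors' statistical test, which the abstract does not specify, and the names `REAL_WORLD_OVERALL_MEN`, `MODEL_MEN_SHARE`, and `gap_vs_benchmark` are hypothetical. Comparing against the overall figure also assumes the role mix in the generated images matches the real-world mix, which the abstract does not confirm.

```python
# Shares of men (percent) as reported in the abstract.
REAL_WORLD_OVERALL_MEN = 60.1
MODEL_MEN_SHARE = {"DALL-E": 86.5, "Midjourney": 69.5, "Imagen": 50.3}


def gap_vs_benchmark(model_share, benchmark):
    """Difference in percentage points, rounded to one decimal place."""
    return round(model_share - benchmark, 1)


gaps = {
    model: gap_vs_benchmark(share, REAL_WORLD_OVERALL_MEN)
    for model, share in MODEL_MEN_SHARE.items()
}

for model, gap in gaps.items():
    # A positive gap means men appear more often than in the real-world benchmark.
    print(f"{model}: {gap:+.1f} percentage points")
```

Read this way, DALL-E and Midjourney sit above the real-world benchmark and Imagen sits below it; whether each gap is statistically significant depends on the per-model image counts, which are not given here.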

Conclusion

Artificial intelligence text-to-image models reflect and amplify systemic biases, overrepresenting men and White leaders, while underrepresenting diversity. Ethical AI practices, including diverse training data sets and fairness-aware algorithms, are essential to ensure equitable representation in health care leadership.