Mia Gisselbaek, MD; Joana Berger-Estilita, MD, PhD; Laurens Minsart, MD; Ekin Köselerli, MD; Arnout Devos, PhD; Francisco Maio Matos, PhD; Odmara L. Barreto Chang, MD, PhD; Peter Dieckmann, PhD; Melanie Suppan, MD; Sarah Saxena, MD, PhD
{"title":"美国人工智能生成的医院领导图像中的性别差异","authors":"Mia Gisselbaek MD , Joana Berger-Estilita MD, PhD , Laurens Minsart MD , Ekin Köselerli MD , Arnout Devos PhD , Francisco Maio Matos PhD , Odmara L. Barreto Chang MD, PhD , Peter Dieckmann PhD , Melanie Suppan MD , Sarah Saxena MD, PhD","doi":"10.1016/j.mcpdig.2025.100218","DOIUrl":null,"url":null,"abstract":"<div><h3>Objective</h3><div>To evaluate demographic representation in artificial intelligence (AI)–generated images of hospital leadership roles and compare them with real-world data from US hospitals.</div></div><div><h3>Patients and Methods</h3><div>This cross-sectional study, conducted from October 1, 2024 to October 31, 2024, analyzed images generated by 3 AI text-to-image models: Midjourney 6.0, OpenAI ChatGPT DALL-E 3, and Google Gemini Imagen 3. Standardized prompts were used to create 1200 images representing 4 key leadership roles: chief executive officers, chief medical officers, chief nursing officers, and chief financial officers. Real-world demographic data from 4397 US hospitals showed that chief executive officers were 73.2% men; chief financial officers, 65.2% men; chief medical officers, 85.7% men; and chief nursing officers, 9.4% men (overall: 60.1% men). The primary outcome was gender representation, with secondary outcomes including race/ethnicity and age. Two independent reviewers assessed images, with interrater reliability evaluated using Cohen κ.</div></div><div><h3>Results</h3><div>Interrater agreement was high for gender (κ=0.998) and moderate for race/ethnicity (κ=0.670) and age (κ=0.605). DALL-E overrepresented men (86.5%) and White individuals (94.5%). Midjourney showed improved gender balance (69.5% men) but overrepresented White individuals (75.0%). Imagen achieved near gender parity (50.3% men) but remained predominantly White (51.5%). Statistically significant differences were observed across models and between models and real-world demographics.</div></div><div><h3>Conclusion</h3><div>Artificial intelligence text-to-image models reflect and amplify systemic biases, overrepresenting men and White leaders, while underrepresenting diversity. Ethical AI practices, including diverse training data sets and fairness-aware algorithms, are essential to ensure equitable representation in health care leadership.</div></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"3 2","pages":"Article 100218"},"PeriodicalIF":0.0000,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Gender Disparities in Artificial Intelligence–Generated Images of Hospital Leadership in the United States\",\"authors\":\"Mia Gisselbaek MD , Joana Berger-Estilita MD, PhD , Laurens Minsart MD , Ekin Köselerli MD , Arnout Devos PhD , Francisco Maio Matos PhD , Odmara L. 
Barreto Chang MD, PhD , Peter Dieckmann PhD , Melanie Suppan MD , Sarah Saxena MD, PhD\",\"doi\":\"10.1016/j.mcpdig.2025.100218\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Objective</h3><div>To evaluate demographic representation in artificial intelligence (AI)–generated images of hospital leadership roles and compare them with real-world data from US hospitals.</div></div><div><h3>Patients and Methods</h3><div>This cross-sectional study, conducted from October 1, 2024 to October 31, 2024, analyzed images generated by 3 AI text-to-image models: Midjourney 6.0, OpenAI ChatGPT DALL-E 3, and Google Gemini Imagen 3. Standardized prompts were used to create 1200 images representing 4 key leadership roles: chief executive officers, chief medical officers, chief nursing officers, and chief financial officers. Real-world demographic data from 4397 US hospitals showed that chief executive officers were 73.2% men; chief financial officers, 65.2% men; chief medical officers, 85.7% men; and chief nursing officers, 9.4% men (overall: 60.1% men). The primary outcome was gender representation, with secondary outcomes including race/ethnicity and age. Two independent reviewers assessed images, with interrater reliability evaluated using Cohen κ.</div></div><div><h3>Results</h3><div>Interrater agreement was high for gender (κ=0.998) and moderate for race/ethnicity (κ=0.670) and age (κ=0.605). DALL-E overrepresented men (86.5%) and White individuals (94.5%). Midjourney showed improved gender balance (69.5% men) but overrepresented White individuals (75.0%). Imagen achieved near gender parity (50.3% men) but remained predominantly White (51.5%). Statistically significant differences were observed across models and between models and real-world demographics.</div></div><div><h3>Conclusion</h3><div>Artificial intelligence text-to-image models reflect and amplify systemic biases, overrepresenting men and White leaders, while underrepresenting diversity. Ethical AI practices, including diverse training data sets and fairness-aware algorithms, are essential to ensure equitable representation in health care leadership.</div></div>\",\"PeriodicalId\":74127,\"journal\":{\"name\":\"Mayo Clinic Proceedings. Digital health\",\"volume\":\"3 2\",\"pages\":\"Article 100218\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Mayo Clinic Proceedings. Digital health\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2949761225000252\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mayo Clinic Proceedings. Digital health","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2949761225000252","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Gender Disparities in Artificial Intelligence–Generated Images of Hospital Leadership in the United States
Objective
To evaluate demographic representation in artificial intelligence (AI)–generated images of hospital leadership roles and to compare it with real-world data from US hospitals.
Patients and Methods
This cross-sectional study, conducted from October 1 to October 31, 2024, analyzed images generated by 3 AI text-to-image models: Midjourney 6.0, OpenAI ChatGPT DALL-E 3, and Google Gemini Imagen 3. Standardized prompts were used to create 1200 images representing 4 key leadership roles: chief executive officers, chief medical officers, chief nursing officers, and chief financial officers. Real-world demographic data from 4397 US hospitals showed that chief executive officers were 73.2% men; chief financial officers, 65.2% men; chief medical officers, 85.7% men; and chief nursing officers, 9.4% men (overall: 60.1% men). The primary outcome was gender representation; secondary outcomes included race/ethnicity and age. Two independent reviewers assessed all images, and interrater reliability was evaluated using Cohen κ.
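The interrater-reliability step is straightforward to reproduce in outline. The sketch below computes Cohen κ for two reviewers assigning gender labels to images; the label lists are hypothetical stand-ins and the helper function is an illustrative implementation, not the study's actual annotations or analysis code.

```python
# Minimal sketch of the interrater-reliability step: Cohen's kappa for two
# reviewers labeling image gender. The labels below are hypothetical.
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters over the same items (nominal labels)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items on which the raters match.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical reviewer labels for 10 images.
rater1 = ["man", "man", "woman", "man", "woman", "man", "man", "woman", "man", "man"]
rater2 = ["man", "man", "woman", "man", "woman", "man", "man", "man", "man", "man"]
print(round(cohen_kappa(rater1, rater2), 3))  # -> 0.737 for these toy labels
```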
Results
Interrater agreement was high for gender (κ=0.998) and moderate for race/ethnicity (κ=0.670) and age (κ=0.605). DALL-E overrepresented men (86.5%) and White individuals (94.5%). Midjourney showed improved gender balance (69.5% men) but still overrepresented White individuals (75.0%). Imagen achieved near gender parity (50.3% men) but remained predominantly White (51.5%). Statistically significant differences were observed both among the 3 models and between the models and real-world demographics.
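As a rough illustration of the model-versus-benchmark comparison, the sketch below runs a chi-square goodness-of-fit test of DALL-E's reported gender split (86.5% men) against the real-world benchmark (60.1% men). It assumes an even allocation of 400 images per model and reconstructs counts from the reported percentages, so it is illustrative only and not the study's actual analysis.

```python
# Illustrative chi-square goodness-of-fit test: DALL-E's reported gender
# distribution vs the real-world hospital-leadership benchmark.
# Assumes 400 images per model (1200 / 3); counts are derived from the
# reported percentages, not the study's raw data.
from scipy.stats import chisquare

n_images = 400                      # assumed images per model
observed = [0.865 * n_images,       # men (86.5% reported for DALL-E)
            0.135 * n_images]       # women
expected = [0.601 * n_images,       # real-world benchmark: 60.1% men
            0.399 * n_images]       # real-world benchmark: 39.9% women
stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2={stat:.1f}, p={p:.2g}")
```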
Conclusion
Artificial intelligence text-to-image models reflect and amplify systemic biases, overrepresenting men and White leaders while underrepresenting diversity. Ethical AI practices, including diverse training data sets and fairness-aware algorithms, are essential to ensure equitable representation in health care leadership.