Generative AI in Medicine: Pioneering Progress or Perpetuating Historical Inaccuracies? Cross-Sectional Study Evaluating Implicit Bias.

JMIR AI (impact factor: 2.0)

Publication date: 2025-06-24. DOI: 10.2196/56891

Philip Sutera, Rohini Bhatia, Timothy Lin, Leslie Chang, Andrea Brown, Reshma Jagsi

JMIR AI, volume 4, article e56891. Journal article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12223688/pdf/

Citations: 0

Abstract


Background: Generative artificial intelligence (gAI) models, such as DALL-E 2, are promising tools that can generate novel images or artwork based on text input. However, caution is warranted, as these tools generate information based on historical data and are thus at risk of propagating past learned inequities. Women in medicine have routinely been underrepresented in academic and clinical medicine and the stereotype of a male physician persists.

Objective: The primary objective is to evaluate implicit bias in gAI-generated images across medical specialties.

Methods: To evaluate for potential implicit bias, 100 photographs for each medical specialty were generated using the gAI platform DALL-E 2. For each specialty, DALL-E 2 was queried with "An American [specialty name]." Our primary endpoint was to compare the gender distribution of gAI photos to the current distribution in the United States. Our secondary endpoint included evaluating the racial distribution. gAI photos were classified according to perceived gender and race based on a unanimous consensus among a diverse group of medical residents. For each medical specialty, the proportion of women among gAI subjects was compared, using χ² analysis, to the most recent Association of American Medical Colleges report for the physician workforce and for active residents.
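The χ² comparison described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' analysis: the abstract does not say which form of the χ² test was used, so the sketch assumes a one-sample goodness-of-fit test with 1 degree of freedom against a fixed benchmark share. The function name `chi_square_gof` and the example counts (30 women among 100 images, a 45% benchmark) are hypothetical and do not come from the study.

```python
import math


def chi_square_gof(observed_women: int, n_images: int, benchmark_share: float):
    """Chi-square goodness-of-fit test with 1 degree of freedom.

    Compares the number of images classified as women with the count
    expected if the images followed a benchmark share (for example, a
    workforce proportion). Returns (statistic, p_value).
    """
    if not 0.0 < benchmark_share < 1.0:
        raise ValueError("benchmark_share must lie strictly between 0 and 1")
    if not 0 <= observed_women <= n_images:
        raise ValueError("observed_women must lie between 0 and n_images")

    observed = (observed_women, n_images - observed_women)
    expected = (n_images * benchmark_share, n_images * (1.0 - benchmark_share))

    # Sum of (observed - expected)^2 / expected over the two categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # For 1 degree of freedom the upper-tail probability of the chi-square
    # distribution has the closed form erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical numbers for illustration only (not taken from the study).
    stat, p = chi_square_gof(observed_women=30, n_images=100, benchmark_share=0.45)
    print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

The closed-form tail probability keeps the sketch free of third-party dependencies. A statistics library would be the more usual choice in practice, and a two-sample test on a 2×2 table would be appropriate if the benchmark were treated as a sample rather than as a known proportion.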

Results: A total of 1900 photos across 19 medical specialties were generated. Compared to physician workforce data, gAI significantly overrepresented women in 7/19 specialties and underrepresented women in 6/19 specialties. Women were significantly underrepresented compared to the physician workforce by 18%, 18%, and 27% in internal medicine, family medicine, and pediatrics, respectively. Compared to current residents, gAI significantly underrepresented women in 12/19 specialties, ranging from 10% to 36%. Additionally, women made up <50% of the gAI-generated subjects in 17/19 specialties.

Conclusions: gAI created a sample population of physicians that underrepresented women when compared to both the resident and active physician workforce. Steps must be taken to curate training datasets so that gAI output represents the diversity of the incoming physician workforce.


Source journal

JMIR AI

Self-citation rate

0.00%

Articles published