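Racial/ethnic reporting differences in cancer literature regarding machine learning vs. a radiologist: a systematic review and meta-analysis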

Journal of medical artificial intelligence. Pub Date: 2023-11-01. DOI: 10.21037/jmai-23-31

Rahil Patel, Destie Provenzano, Sherrie Flynt Wallington, Murray Loew, Yuan James Rao, Sharad Goyal


Citations: 0

Abstract



Background: Machine learning (ML) has emerged as a promising tool to assist physicians in diagnosing and classifying patient conditions from medical imaging data. However, as clinical applications of ML become more common, there is concern about ethnoracial biases introduced by improper algorithm training. It has long been known that cancer outcomes vary across racial/ethnic groups.

Methods: We reviewed 84 studies that reported results of ML algorithms compared to radiologists for cancer prediction, to evaluate whether algorithms targeted at cancer prediction account for potential ethnoracial biases in their training samples. Articles were retrieved from PubMed, MEDLINE, and Google Scholar, and all studies published before May 2022 were extracted. Two researchers independently reviewed 115 articles and evaluated whether demographic information was reported and incorporated into the algorithm. Studies were excluded if they used an inappropriate imaging type, did not report benign vs. malignant cancer results, did not compare the algorithm to a board-certified radiologist, or were not in English.

Results: Of the 84 studies included, 87% (n=73) reported demographic information and 38% (n=32) evaluated the effect of demographic information on model performance. However, only about 11% (n=9) of the articles reported racial/ethnic groups and about 4% (n=3) incorporated racial/ethnic information into their models. Of the nine studies that reported racial/ethnic information, the groups reported most often were White/Caucasian (n=9/9) and Black/African American (n=8/9). Asian (n=4/9), American Indian (n=3/9), and Hispanic (n=2/9) were each reported in fewer than half of the studies.

Conclusions: The lack of not only racial/ethnic information but also other demographic information such as age, gender, body mass index (BMI), or patient history is indicative of a larger problem within artificial intelligence (AI) for cancer imaging. It is crucial to report and account for demographics, both in AI for cancer and in the overall care of a cancer patient. The findings from this study highlight the need for ML studies to consider and evaluate demographic information when selecting a patient population for training the algorithm.
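The percentages quoted in the Results can be checked directly against the reported counts. The following is a minimal sketch for that arithmetic only; the dictionary labels and the helper name `percent_of_total` are illustrative assumptions and do not come from the paper, while the counts and the total of 84 are taken from the abstract.

```python
# Counts as reported in the abstract; the labels are illustrative, not the authors' terms.
TOTAL_INCLUDED = 84

reported_counts = {
    "reported demographic information": 73,
    "evaluated demographic effect on performance": 32,
    "reported racial/ethnic groups": 9,
    "incorporated race/ethnicity into model": 3,
}


def percent_of_total(count: int, total: int = TOTAL_INCLUDED) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical helper output: one percentage per reported count.
percentages = {label: percent_of_total(n) for label, n in reported_counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

Rounded to one decimal place the four values are 86.9, 38.1, 10.7, and 3.6, which agree with the 87%, 38%, about 11%, and about 4% stated in the abstract.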


Source journal

Journal of medical artificial intelligence

CiteScore

2.30

Self-citation rate

0.00%

Number of publications