Best Practices on Big Data Analytics to Address Sex-Specific Biases in Our Understanding of the Etiology, Diagnosis, and Prognosis of Diseases.

IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY
Su Golder, Karen O'Connor, Yunwen Wang, Robin Stevens, Graciela Gonzalez-Hernandez
{"title":"Best Practices on Big Data Analytics to Address Sex-Specific Biases in Our Understanding of the Etiology, Diagnosis, and Prognosis of Diseases.","authors":"Su Golder, Karen O'Connor, Yunwen Wang, Robin Stevens, Graciela Gonzalez-Hernandez","doi":"10.1146/annurev-biodatasci-122120-025806","DOIUrl":null,"url":null,"abstract":"<p><p>A bias in health research to favor understanding diseases as they present in men can have a grave impact on the health of women. This paper reports on a conceptual review of the literature on machine learning or natural language processing (NLP) techniques to interrogate big data for identifying sex-specific health disparities. We searched Ovid MEDLINE, Embase, and PsycINFO in October 2021 using synonyms and indexing terms for (<i>a</i>) \"women,\" \"men,\" or \"sex\"; (<i>b</i>) \"big data,\" \"artificial intelligence,\" or \"NLP\"; and (<i>c</i>) \"disparities\" or \"differences.\" From 902 records, 22 studies met the inclusion criteria and were analyzed. Results demonstrate that the inclusion by sex is inconsistent and often unreported, although the inclusion of men in these studies is disproportionately less than women. Even though artificial intelligence and NLP techniques are widely applied in healthresearch, few studies use them to take advantage of unstructured text to investigate sex-related differences or disparities. Researchers are increasingly aware of sex-based data bias, but the process toward correction is slow. We reflect on best practices on using big data analytics to address sex-specific biases in understanding the etiology, diagnosis, and prognosis of diseases.</p>","PeriodicalId":29775,"journal":{"name":"Annual Review of Biomedical Data Science","volume":"5 ","pages":"251-267"},"PeriodicalIF":7.0000,"publicationDate":"2022-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11524028/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annual Review of Biomedical Data Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1146/annurev-biodatasci-122120-025806","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2022/5/13 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

A bias in health research to favor understanding diseases as they present in men can have a grave impact on the health of women. This paper reports on a conceptual review of the literature on machine learning or natural language processing (NLP) techniques to interrogate big data for identifying sex-specific health disparities. We searched Ovid MEDLINE, Embase, and PsycINFO in October 2021 using synonyms and indexing terms for (a) "women," "men," or "sex"; (b) "big data," "artificial intelligence," or "NLP"; and (c) "disparities" or "differences." From 902 records, 22 studies met the inclusion criteria and were analyzed. Results demonstrate that the inclusion by sex is inconsistent and often unreported, although the inclusion of men in these studies is disproportionately less than women. Even though artificial intelligence and NLP techniques are widely applied in healthresearch, few studies use them to take advantage of unstructured text to investigate sex-related differences or disparities. Researchers are increasingly aware of sex-based data bias, but the process toward correction is slow. We reflect on best practices on using big data analytics to address sex-specific biases in understanding the etiology, diagnosis, and prognosis of diseases.

大数据分析的最佳实践,以解决我们在了解疾病的病因、诊断和预后时存在的性别偏见。
健康研究中偏向于理解男性所患疾病的偏见会对女性的健康产生严重影响。本文对有关机器学习或自然语言处理(NLP)技术的文献进行了概念性综述,这些技术用于查询大数据以识别特定性别的健康差异。我们在 2021 年 10 月使用以下同义词和索引词对 Ovid MEDLINE、Embase 和 PsycINFO 进行了检索:(a) "女性"、"男性 "或 "性别";(b) "大数据"、"人工智能 "或 "NLP";(c) "差异 "或 "差别"。在 902 条记录中,有 22 项研究符合纳入标准并进行了分析。结果表明,按性别纳入研究的情况并不一致,而且往往未作报告,尽管男性在这些研究中的比例比女性少得多。尽管人工智能和 NLP 技术已广泛应用于健康研究,但很少有研究利用它们来研究非结构化文本中与性别相关的差异或差距。研究人员越来越意识到基于性别的数据偏差,但纠正过程却很缓慢。我们反思了在了解疾病的病因、诊断和预后方面使用大数据分析来解决性别偏见的最佳实践。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
11.10
自引率
1.70%
发文量
0
期刊介绍: The Annual Review of Biomedical Data Science provides comprehensive expert reviews in biomedical data science, focusing on advanced methods to store, retrieve, analyze, and organize biomedical data and knowledge. The scope of the journal encompasses informatics, computational, artificial intelligence (AI), and statistical approaches to biomedical data, including the sub-fields of bioinformatics, computational biology, biomedical informatics, clinical and clinical research informatics, biostatistics, and imaging informatics. The mission of the journal is to identify both emerging and established areas of biomedical data science, and the leaders in these fields.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信