Best Practices on Big Data Analytics to Address Sex-Specific Biases in our Understanding of the Etiology, Diagnosis and Prognosis of Diseases

IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY
S. Golder, K. O’Connor, Yunwen Wang, R. Stevens, G. Gonzalez-Hernandez
DOI: 10.1101/2022.01.31.22270183 (https://doi.org/10.1101/2022.01.31.22270183)
Journal: Annual Review of Biomedical Data Science
Publication date: 2022-02-06
Publication type: Journal Article
Citations: 1

Abstract
A bias in health research to favor understanding of diseases as they present in men can have a grave impact on the health of women. This paper reports on a conceptual review of the literature that used machine learning or NLP techniques to interrogate big data for identifying sex-specific health disparities. We searched Ovid MEDLINE, Embase, and PsycINFO in October 2021 using synonyms and indexing terms for (1) "women" or "men" or "sex," (2) "big data" or "artificial intelligence" or "NLP," and (3) "disparities" or "differences." From 902 records, 22 studies met the inclusion criteria and were analyzed. Results demonstrate that the inclusion by sex is inconsistent and often unreported, although the inclusion of men in the included studies is disproportionately less than women. Even though AI and NLP techniques are widely applied in health research, few studies use them to take advantage of unstructured text to investigate sex-related differences or disparities. Researchers are increasingly aware of sex-based data bias, but the process toward correction is slow. We reflected on what would be the best practices for using big data analytics to address sex-specific biases in understanding the etiology, diagnosis, and prognosis of diseases.
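The three-part search described in the abstract can be sketched as a Boolean query. The following is a minimal illustration only: it uses just the terms quoted in the abstract, assumes that terms within a concept are combined with OR and the three concepts with AND (the usual convention, not stated in the text), and does not reproduce the authors' full synonym lists, indexing terms, or any database-specific syntax. The names `CONCEPT_BLOCKS`, `quote`, and `build_query` are hypothetical.

```python
# Illustrative only: these are the concept terms quoted in the abstract.
# The authors' complete strategy (synonyms, indexing terms, database
# syntax) is not given in this text and is not reproduced here.
CONCEPT_BLOCKS = [
    ["women", "men", "sex"],
    ["big data", "artificial intelligence", "NLP"],
    ["disparities", "differences"],
]


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they read as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(blocks: list[list[str]]) -> str:
    """OR the terms inside each block, then AND the blocks together (assumed)."""
    groups = ["(" + " OR ".join(quote(t) for t in block) + ")" for block in blocks]
    return " AND ".join(groups)


query = build_query(CONCEPT_BLOCKS)
print(query)
```

Keeping each concept as its own list makes it straightforward to extend a block with further synonyms without touching the combination logic.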
Source journal
CiteScore: 11.10
Self-citation rate: 1.70%
Articles published: 0
About the journal: The Annual Review of Biomedical Data Science provides comprehensive expert reviews in biomedical data science, focusing on advanced methods to store, retrieve, analyze, and organize biomedical data and knowledge. The scope of the journal encompasses informatics, computational, artificial intelligence (AI), and statistical approaches to biomedical data, including the sub-fields of bioinformatics, computational biology, biomedical informatics, clinical and clinical research informatics, biostatistics, and imaging informatics. The mission of the journal is to identify both emerging and established areas of biomedical data science, and the leaders in these fields.