Understanding Demographic Bias and Representation in Social Media Health Data
Nina L. Cesare, Christan Earl Grant, E. Nsoesie
Companion Publication of the 10th ACM Conference on Web Science, 2019-06-26
DOI: 10.1145/3328413.3328415 (https://doi.org/10.1145/3328413.3328415)
Citations: 11
Abstract
Text, images, geotags and other data from social media sites offer researchers a unique window into population health trends and disease spread. While these data provide the opportunity to track and measure health outcomes across geographic regions, over extended periods of time, and through complex social networks, they also present challenges. Most notably, these data carry significant biases due to demographic differences in who chooses to use each platform and what they choose to share. While several publications have discussed the limitations of leveraging social media data for public health research, few studies systematically investigate their demographic bias or explore mitigation strategies, leaving the area ripe for interdisciplinary contributions. In this discussion paper, we highlight that understanding the strengths and limitations of these data sources would enable a rigorous assessment of their usefulness for public health research and provide a means for quantifying uncertainty in research findings.