Mining socioenvironmental drivers of bias in social media rescue data for fairness-aware modeling: A case study of Hurricane Harvey
Kejin Wang, Volodymyr Mihunov, Nina S.N. Lam, Mingxuan Sun
International Journal of Disaster Risk Reduction, vol. 128, Article 105717. Published 2025-07-19. DOI: 10.1016/j.ijdrr.2025.105717
https://www.sciencedirect.com/science/article/pii/S2212420925005412
Abstract
Researchers now use AI to extract valuable insights from social media data for rapid rescue and emergency management. However, social media users are not a representative sample of the entire population. This study establishes baseline Expected Rescue Request frequencies using social media data to address two critical knowledge gaps: (1) whether expected rescue request frequencies during disasters differ systematically among communities with different social-environmental characteristics, and (2) which of these characteristics reflect data representation biases and should be treated as sensitive attributes in fairness-aware AI models. Using 35 geographical, socioeconomic, and digital access variables alongside rescue request tweets from Hurricane Harvey (2017), the study developed a novel fairness measurement index (Rescue Request Difference) and analyzed it through regression and Random Forest modeling to balance predictive power, interpretability, and transparency. Results showed that underserved communities exhibited higher proportions of physically and financially vulnerable populations, flood-prone geography, limited digital access, and single-family housing. Key attributes such as "% Minority" and "Road density" were identified as sensitive features requiring fairness rectification in predictive models. Limitations include reliance on a single case study and the biases inherent in social media data. Future work should integrate multi-source data (e.g., 911 calls, volunteer reports) to strengthen the fairness-aware modeling framework.
About the journal:
The International Journal of Disaster Risk Reduction (IJDRR) is the journal for researchers, policymakers and practitioners across diverse disciplines: earth sciences and their implications; environmental sciences; engineering; urban studies; geography; and the social sciences. IJDRR publishes fundamental and applied research, critical reviews, policy papers and case studies with a particular focus on multi-disciplinary research that aims to reduce the impact of natural, technological, social and intentional disasters. IJDRR stimulates exchange of ideas and knowledge transfer on disaster research, mitigation, adaptation, prevention and risk reduction at all geographical scales: local, national and international.
Key topics:
- multifaceted and cascading disasters
- the development of disaster risk reduction strategies and techniques
- discussion and development of effective warning and educational systems for risk management at all levels
- disasters associated with climate change
- vulnerability analysis and vulnerability trends
- emerging risks
- resilience against disasters
The journal particularly encourages papers that approach risk from a multi-disciplinary perspective.