BanglaCalamityMMD: A comprehensive benchmark dataset for multimodal disaster identification in the low-resource Bangla language

IF 4.5 | Zone 1, Earth Sciences | Q1 Geosciences, Multidisciplinary

International journal of disaster risk reduction Pub Date : 2025-09-25 DOI:10.1016/j.ijdrr.2025.105800

Fatema Tuj Johora Faria, Mukaffi Bin Moin, Busra Kamal Rafa, Swarnajit Saha, Md. Mahfuzur Rahman, Khan Md Hasib, M.F. Mridha

Volume 130, Article 105800. Full text: https://www.sciencedirect.com/science/article/pii/S2212420925006247

Citations: 0

Abstract



The abundance of social media datasets with crisis messages has greatly impacted disaster response and assessment. Extracting vital information from this data is crucial for enhancing situational awareness and enabling rapid decision-making, which requires robust techniques to filter out misleading and irrelevant content. This study introduces a hybrid multimodal fusion technique that integrates text and image data to identify relevant disaster-related images from social media. It represents a pioneering effort in multimodal disaster identification for the Bangla language, addressing a significant gap where previous research has focused exclusively on English text. To this end, we curated the “BanglaCalamityMMD” dataset, which includes 7,903 data points distributed across seven disaster categories (Earthquake, Flood, Landslides, Wildfires, Tropical Storms, Droughts, and Human Damage), along with a non-disaster category. Our technique employs three deep learning components: DisasterTextNet for text-based disaster detection, DisasterImageNet for image-based disaster categorization, and DisasterMultiFusionNet, which combines text and image modalities using fusion techniques like Early Fusion, Late Fusion, and Intermediate Fusion. The system uses Vision Transformer variants to extract visual features and pre-trained BERT models to extract textual features. Our multimodal architecture (DisasterMultiFusionNet) significantly outperforms unimodal approaches. The unimodal text-based approach achieves 79.90% accuracy with mBERT, and the image-based approach reaches 78.65% accuracy using Swin Transformer. In comparison, our multimodal technique achieves 85.25% accuracy with Swin Transformer and mBERT (DisasterMultiFusionNet), an improvement of 5.35 percentage points over the best unimodal approach. This highlights the effectiveness of our fusion technique and the reliability of our multimodal framework in enhancing disaster identification accuracy. To our knowledge, this is the first research on multimodal disaster identification in the low-resource Bangla language context.
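The three fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' DisasterMultiFusionNet: the function names, the toy feature vectors, the ReLU projection, and the hand-picked weights below are all assumptions made for illustration, and a real system would take its features from encoders such as Swin Transformer and mBERT and learn the weights from data.

```python
import math

# Illustrative sketch only: every name and number here is hypothetical.


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, bias):
    """One dense layer; `weights` holds one row of coefficients per output."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def project(features, proj):
    """Map one modality into a hidden space (dense layer followed by ReLU)."""
    hidden = linear(features, proj, [0.0] * len(proj))
    return [max(0.0, h) for h in hidden]


def early_fusion(text_feat, image_feat, weights, bias):
    """Early fusion: concatenate raw features, then classify once."""
    return softmax(linear(text_feat + image_feat, weights, bias))


def intermediate_fusion(text_feat, image_feat, text_proj, image_proj,
                        weights, bias):
    """Intermediate fusion: project each modality, then concatenate."""
    fused = project(text_feat, text_proj) + project(image_feat, image_proj)
    return softmax(linear(fused, weights, bias))


def late_fusion(text_probs, image_probs, text_weight=0.5):
    """Late fusion: blend the outputs of two separate unimodal classifiers."""
    return [text_weight * t + (1.0 - text_weight) * i
            for t, i in zip(text_probs, image_probs)]


def predict(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Two toy classes stand in for the dataset's eight categories.
    text_probs, image_probs = [0.2, 0.8], [0.6, 0.4]
    print(predict(late_fusion(text_probs, image_probs)))
```

The difference between the strategies is where the modalities meet: before any modality-specific processing (early), after each modality has its own hidden representation (intermediate), or only at the level of final predictions (late).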


Source journal

International Journal of Disaster Risk Reduction
Categories: Geosciences, Multidisciplinary; Meteorology & Atmospheric Sciences

CiteScore: 8.70
Self-citation rate: 18.00%
Articles published: 688
Review time: 79 days

About the journal: The International Journal of Disaster Risk Reduction (IJDRR) is the journal for researchers, policymakers and practitioners across diverse disciplines: earth sciences and their implications; environmental sciences; engineering; urban studies; geography; and the social sciences. IJDRR publishes fundamental and applied research, critical reviews, policy papers and case studies with a particular focus on multi-disciplinary research that aims to reduce the impact of natural, technological, social and intentional disasters. IJDRR stimulates exchange of ideas and knowledge transfer on disaster research, mitigation, adaptation, prevention and risk reduction at all geographical scales: local, national and international.

Key topics:
- multifaceted disasters and cascading disasters
- the development of disaster risk reduction strategies and techniques
- discussion and development of effective warning and educational systems for risk management at all levels
- disasters associated with climate change
- vulnerability analysis and vulnerability trends
- emerging risks
- resilience against disasters

The journal particularly encourages papers that approach risk from a multi-disciplinary perspective.