BanglaCalamityMMD: A comprehensive benchmark dataset for multimodal disaster identification in the low-resource Bangla language

IF 4.5 | Zone 1, Earth Sciences | JCR Q1, GEOSCIENCES, MULTIDISCIPLINARY
Fatema Tuj Johora Faria, Mukaffi Bin Moin, Busra Kamal Rafa, Swarnajit Saha, Md. Mahfuzur Rahman, Khan Md Hasib, M.F. Mridha
Journal: International Journal of Disaster Risk Reduction, volume 130, Article 105800
Publication type: Journal Article
Published: 2025-09-25
DOI: 10.1016/j.ijdrr.2025.105800
URL: https://www.sciencedirect.com/science/article/pii/S2212420925006247
BanglaCalamityMMD: A comprehensive benchmark dataset for multimodal disaster identification in the low-resource Bangla language
The abundance of social media datasets containing crisis messages has greatly influenced disaster response and assessment. Extracting vital information from this data is crucial for enhancing situational awareness and enabling rapid decision-making, which calls for robust techniques to filter out misleading and irrelevant content. This study introduces a hybrid multimodal fusion technique that integrates text and image data to identify relevant disaster-related images from social media. It represents a pioneering effort in multimodal disaster identification for the Bangla language, addressing a significant gap: previous research has focused exclusively on English text. To this end, we curated the “BanglaCalamityMMD” dataset, which includes 7,903 data points distributed across seven disaster categories (Earthquake, Flood, Landslides, Wildfires, Tropical Storms, Droughts, and Human Damage), along with a non-disaster category. Our technique employs three deep learning components: DisasterTextNet for text-based disaster detection, DisasterImageNet for image-based disaster categorization, and DisasterMultiFusionNet, which combines the text and image modalities using fusion techniques such as Early Fusion, Late Fusion, and Intermediate Fusion. The system uses Vision Transformer variants to extract visual features and pre-trained BERT models to extract textual features. Our multimodal architecture (DisasterMultiFusionNet) significantly outperforms the unimodal approaches. The unimodal text-based approach achieves 79.90% accuracy with mBERT, and the image-based approach reaches 78.65% accuracy with Swin Transformer. In comparison, our multimodal technique achieves 85.25% accuracy with Swin Transformer and mBERT, a 5.35 percentage-point improvement over the best unimodal approach (85.25 - 79.90 = 5.35). This highlights the effectiveness of our fusion technique and the reliability of our multimodal framework in enhancing disaster identification accuracy. To our knowledge, this is the first research on multimodal disaster identification in the low-resource Bangla language context.
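The abstract names Early Fusion and Late Fusion without describing how they are implemented. As a rough illustration of the general idea only, and not the authors' DisasterMultiFusionNet, the sketch below contrasts the two on toy data. The function names, the equal weighting, the probability values, and the label string "Non-Disaster" are all assumptions made for this example; only the seven disaster category names come from the abstract.

```python
from typing import List, Sequence

# Seven disaster categories from the abstract, plus an assumed label
# string for the non-disaster category.
CLASSES = ["Earthquake", "Flood", "Landslides", "Wildfires",
           "Tropical Storms", "Droughts", "Human Damage", "Non-Disaster"]


def early_fusion(text_features: Sequence[float],
                 image_features: Sequence[float]) -> List[float]:
    """Early fusion: concatenate feature vectors so one classifier sees both."""
    return list(text_features) + list(image_features)


def late_fusion(text_probs: Sequence[float],
                image_probs: Sequence[float],
                text_weight: float = 0.5) -> List[float]:
    """Late fusion: weighted average of each model's class probabilities."""
    if len(text_probs) != len(image_probs):
        raise ValueError("both models must score the same set of classes")
    return [text_weight * t + (1.0 - text_weight) * i
            for t, i in zip(text_probs, image_probs)]


def predict(probs: Sequence[float]) -> str:
    """Return the label with the highest fused probability."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return CLASSES[best]


# Hypothetical scores for one post: the text model leans towards "Flood",
# the image model leans towards "Tropical Storms".
text_probs = [0.05, 0.50, 0.05, 0.05, 0.20, 0.05, 0.05, 0.05]
image_probs = [0.05, 0.30, 0.05, 0.05, 0.40, 0.05, 0.05, 0.05]

fused = late_fusion(text_probs, image_probs)
print(predict(fused))
```

In this toy case the two models disagree, and averaging their probabilities lets the more confident text prediction decide the label. Intermediate fusion, which merges representations inside the network, is not shown because the abstract gives no detail on it.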
Source journal: International Journal of Disaster Risk Reduction
Subject categories: GEOSCIENCES, MULTIDISCIPLINARY; METEOROLOGY & ATMOSPHERIC SCIENCES
CiteScore: 8.70
Self-citation rate: 18.00%
Articles published: 688
Review time: 79 days
About the journal: The International Journal of Disaster Risk Reduction (IJDRR) is the journal for researchers, policymakers and practitioners across diverse disciplines: earth sciences and their implications; environmental sciences; engineering; urban studies; geography; and the social sciences. IJDRR publishes fundamental and applied research, critical reviews, policy papers and case studies with a particular focus on multi-disciplinary research that aims to reduce the impact of natural, technological, social and intentional disasters. IJDRR stimulates exchange of ideas and knowledge transfer on disaster research, mitigation, adaptation, prevention and risk reduction at all geographical scales: local, national and international.
Key topics:
- multifaceted disasters and cascading disasters
- the development of disaster risk reduction strategies and techniques
- discussion and development of effective warning and educational systems for risk management at all levels
- disasters associated with climate change
- vulnerability analysis and vulnerability trends
- emerging risks
- resilience against disasters
The journal particularly encourages papers that approach risk from a multi-disciplinary perspective.