Multi-modal deep learning framework for damage detection in social media posts
Authors: Jiale Zhang, Manyu Liao, Yanping Wang, Yifan Huang, Fuyu Chen, Chiba Makiko
DOI: 10.7717/peerj-cs.2262 (https://doi.org/10.7717/peerj-cs.2262)
Published: 2024-08-20 (Journal Article)
Citations: 0
Abstract
In crisis management, quickly identifying and helping affected individuals is key, especially when there is limited information about the survivors’ conditions. Traditional emergency systems often face issues with reachability and handling large volumes of requests. Social media has become crucial in disaster response, providing important information and aiding in rescues when standard communication systems fail. Due to the large amount of data generated on social media during emergencies, there is a need for automated systems to process this information effectively and help improve emergency responses, potentially saving lives. Therefore, accurately understanding visual scenes and their meanings is important for identifying damage and obtaining useful information. Our research introduces a framework for detecting damage in social media posts, combining the Bidirectional Encoder Representations from Transformers (BERT) architecture with advanced convolutional processing. This framework includes a BERT-based network for analyzing text and multiple convolutional neural network blocks for processing images. The results show that this combination is very effective, outperforming existing methods in accuracy, recall, and F1 score. In the future, this method could be enhanced by including more types of information, such as human voices or background sounds, to improve its prediction efficiency.
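The abstract compares methods by accuracy, recall, and F1 score. As a minimal sketch of how these metrics are conventionally defined for a binary "damage" versus "no damage" task, the following may help; the function name `binary_metrics`, the label encoding, and the sample labels are all hypothetical and do not come from the paper, whose exact evaluation protocol is not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    recall: float
    precision: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, recall, precision, and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when the positive class is never
    # present (recall) or never predicted (precision).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, recall, precision, f1)


# Assumed encoding: 1 = "damage", 0 = "no damage".
# These labels are made up for illustration and are not results from the paper.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
m = binary_metrics(y_true, y_pred)
print(m)
```

Recall matters most when a missed damage report is costly, while F1 balances it against precision so that a model cannot score well simply by flagging every post as damage.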