Detecting information from Twitter on landslide hazards in Italy using deep learning models

IF 4.0, JCR Q2, Environmental Sciences

Geoenvironmental Disasters Pub Date : 2024-07-30 DOI:10.1186/s40677-024-00279-4

Rachele Franceschini, Ascanio Rosi, Filippo Catani, Nicola Casagli


Citations: 0

Abstract


Mass media are a new and important source of information on any natural disaster, mass emergency, pandemic, economic or political event, or extreme weather event affecting one or more communities in a country. Several techniques have been developed for mining social media data on many natural events, but few have been applied to the automatic extraction of landslide events. In this study, Twitter was investigated to detect Italian-language data about landslide events. The main aim is to obtain an automatic text classification based on information about natural hazards; text classification of Italian-language content has not previously been applied to detect this type of natural hazard. Over 13,000 records were extracted from Twitter using five keywords referring to landslide events. The dataset was classified manually, providing a solid base for applying deep learning. A combination of BERT + CNN was chosen for text classification, and two different pre-processing approaches and BERT models were applied. BERT-multicase + CNN without preprocessing achieved the highest accuracy, 96%, with an AUC of 0.96. Two advantages resulted from this study: the classified Italian-language dataset for landslide events fills the present gap in analysing natural events using Twitter, and BERT + CNN, trained to detect this information, proved to be an excellent classifier of Italian-language text on landslide events.
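The abstract reports two evaluation figures, accuracy and AUC. As a rough illustration of what those two numbers measure for a binary "landslide-related / not related" classifier, the following is a minimal sketch in plain Python. The labels, scores, 0.5 threshold, and function names are all hypothetical and are not taken from the paper; the authors' actual evaluation code is not described in the source.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of items whose thresholded score matches the manual label."""
    if len(labels) != len(scores) or len(labels) == 0:
        raise ValueError("labels and scores must be non-empty and equal in length")
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive item scores higher than a randomly chosen negative item
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical manual labels (1 = landslide-related) and classifier scores.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(accuracy(labels, scores))  # 0.75
print(roc_auc(labels, scores))   # 0.9375
```

Accuracy depends on the chosen decision threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the two are commonly reported together.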


Source journal

Geoenvironmental Disasters Social Sciences-Geography, Planning and Development

CiteScore

8.90

Self-citation rate

6.20%


Journal description: Geoenvironmental Disasters is an international journal with a focus on multi-disciplinary applied and fundamental research and on the effects and impacts on infrastructure, society and the environment of geoenvironmental disasters triggered by various types of geo-hazards (e.g. earthquakes, volcanic activity, landslides, tsunamis, intensive erosion and hydro-meteorological events). The integrated study of geoenvironmental disasters is an emerging and composite field of research interfacing with areas traditionally within civil engineering, earth sciences, atmospheric sciences and the life sciences. It centers on the interactions within and between the Earth's ground, air and water environments, all of which are affected by climate, geological, morphological and anthropological processes, and by biological and ecological cycles. Disasters are dynamic forces which can change the Earth pervasively, rapidly, or abruptly, and which can generate lasting effects on the natural and built environments. The journal publishes research papers, case studies and quick reports of recent geoenvironmental disasters, review papers, and technical reports of various geoenvironmental disaster-related case studies. Case studies and quick reports of recent geoenvironmental disasters are a major component of the journal; they help to advance the practical understanding of geoenvironmental disasters and to inform future research priorities. The journal aims for the rapid publication of research papers at a high scientific level. It welcomes proposals for monothematic issues and for special issues reflecting the trends in geoenvironmental disaster reduction. Researchers and practitioners are encouraged to submit original, unpublished contributions.