Leveraging Social Media as a Source of Mobility Intelligence: An NLP-Based Approach
Tânia Fontes; Francisco Murços; Eduardo Carneiro; Joel Ribeiro; Rosaldo J. F. Rossetti
IEEE Open Journal of Intelligent Transportation Systems, vol. 4, pp. 663-681. Journal Article, published 2023-08-24.
DOI: 10.1109/OJITS.2023.3308210
URL: https://ieeexplore.ieee.org/document/10229505/
Citations: 1
Abstract
This work presents a deep learning framework for analyzing urban mobility by extracting knowledge from messages collected from Twitter. The framework, which is designed to handle large-scale data and adapt automatically to new contexts, comprises three main modules: data collection and system configuration, data analytics, and aggregation and visualization. The text data is pre-processed using NLP techniques to remove informal words, slang, and misspellings. A pre-trained, unsupervised word embedding model, BERT, is used to classify travel-related tweets using a unigram approach with three dictionaries of travel-related target words: small, medium, and big. Public opinion is evaluated using VADER to classify travel-related tweets according to their sentiments. The mobility of three major cities was assessed: London, Melbourne, and New York. The framework demonstrates consistently high average performance, with a Precision of 0.80 for text classification and 0.77 for sentiment analysis. The framework can aggregate sparse information from social media and provide updated information in near real-time with high spatial resolution, enabling easy identification of traffic-related events. The framework is helpful for transportation decision-makers in operational control, tactical-strategic planning, and policy evaluation. For example, it can be used to improve the management of resources during traffic congestion or emergencies.
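The unigram dictionary idea and the Precision metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: plain keyword matching stands in for the BERT-based classifier, and the dictionary contents, the function names (`unigrams`, `is_travel_related`, `precision`) and the toy messages are all hypothetical, chosen only for illustration. Precision is computed in the standard way as TP / (TP + FP).

```python
import re

# Hypothetical "small" dictionary of travel-related target words.
# These words are illustrative assumptions, not taken from the paper.
SMALL_DICTIONARY = {"traffic", "bus", "train", "subway", "commute", "congestion"}


def unigrams(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def is_travel_related(text, dictionary=SMALL_DICTIONARY):
    """Flag a message if any of its unigrams is a target word.

    Simple keyword matching, used here as a stand-in for the
    embedding-based classification described in the abstract.
    """
    return any(token in dictionary for token in unigrams(text))


def precision(predicted, actual):
    """Precision = TP / (TP + FP); 0.0 when nothing is predicted positive."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labelled messages (invented for this example): (text, is travel-related).
SAMPLE = [
    ("Heavy traffic on the bridge this morning", True),
    ("The bus was late again", True),
    ("I need to train my dog this weekend", False),
    ("Lovely sunset tonight", False),
]

predictions = [is_travel_related(text) for text, _ in SAMPLE]
labels = [label for _, label in SAMPLE]

if __name__ == "__main__":
    print(predictions)
    print(round(precision(predictions, labels), 2))
```

In this toy sample the third message is flagged because "train" is in the dictionary even though it is used as a verb, so precision falls below 1. Ambiguity of this kind is one plausible reason to prefer a contextual model over bare keyword matching.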