Lucas G.S. Félix, Washington Cunha, Claudio M.V. de Andrade, Marcos André Gonçalves, Jussara M. Almeida
{"title":"Why are you traveling? Inferring trip profiles from online reviews and domain-knowledge","authors":"Lucas G.S. Félix, Washington Cunha, Claudio M.V. de Andrade, Marcos André Gonçalves, Jussara M. Almeida","doi":"10.1016/j.osnem.2024.100296","DOIUrl":"10.1016/j.osnem.2024.100296","url":null,"abstract":"<div><div>This paper addresses the task of inferring trip profiles (TPs), which consists of determining the profile of travelers engaged in a particular trip given a set of possible categories. TPs may include working trips, leisure journeys with friends, or family vacations. Travelers with different TPs typically have varied plans regarding destinations and timing. TP inference may provide significant insights for numerous tourism-related services, such as geo-recommender systems and tour planning. We focus on TP inference using TripAdvisor, a prominent tourism-centric social media platform, as our data source. Our goal is to evaluate how effectively we can automatically discern the TP from a user review on this platform. A user review encompasses both textual feedback and domain-specific data (such as a user’s previous visits to the location), which are crucial for accurately characterizing the trip. To achieve this, we assess various feature sets (including text and domain-specific) and implement advanced machine learning models, such as neural Transformers and open-source Large Language Models (Llama 2, Bloom). We examine two variants of the TP inference task—binary and multi-class. 
Surprisingly, our findings reveal that combining domain-specific features with TF-IDF-based representation in an LGBM model performs as well as more complex Transformer and LLM models, while being much more efficient and interpretable.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"45 ","pages":"Article 100296"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143095300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Riccardo Cantini, Cristian Cosentino, Fabrizio Marozzo, Domenico Talia, Paolo Trunfio
{"title":"Harnessing prompt-based large language models for disaster monitoring and automated reporting from social media feedback","authors":"Riccardo Cantini, Cristian Cosentino, Fabrizio Marozzo, Domenico Talia, Paolo Trunfio","doi":"10.1016/j.osnem.2024.100295","DOIUrl":"10.1016/j.osnem.2024.100295","url":null,"abstract":"<div><div>In recent years, social media has emerged as one of the main platforms for real-time reporting of issues during disasters and catastrophic events. While great strides have been made in collecting such information, there remains an urgent need to improve user reports’ automation, aggregation, and organization to streamline various tasks, including rescue operations, resource allocation, and communication with the press. This paper introduces an innovative methodology that leverages the power of prompt-based Large Language Models (LLMs) to strengthen disaster response and management. By analyzing large volumes of user-generated content, our methodology identifies issues reported by citizens who have experienced a disastrous event, such as damaged buildings, broken gas pipelines, and flooding. It also localizes all posts containing references to geographic information in the text, allowing for aggregation of posts that occurred nearby. By leveraging these localized citizen-reported issues, the methodology generates insightful reports full of essential information for emergency services, news agencies, and other interested parties. Extensive experimentation on large datasets validates the accuracy and efficiency of our methodology in classifying posts, detecting sub-events, and producing real-time reports. 
These findings highlight the practical value of prompt-based LLMs in disaster response, emphasizing their flexibility and adaptability in delivering timely insights that support more effective interventions.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"45 ","pages":"Article 100295"},"PeriodicalIF":0.0,"publicationDate":"2024-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HaRNaT - A dynamic hashtag recommendation system using news","authors":"Divya Gupta, Shampa Chakraverty","doi":"10.1016/j.osnem.2024.100294","DOIUrl":"10.1016/j.osnem.2024.100294","url":null,"abstract":"<div><div>Microblogging platforms such as <em>X</em> and <em>Mastodon</em> have evolved into significant data sources, for which Hashtag Recommendation Systems (HRS) are being devised to automate the recommendation of hashtags for user queries. We propose a context-sensitive, Machine Learning-based HRS named <em>HaRNaT</em> that strategically leverages news articles to identify pertinent keywords and subjects related to a query. It interprets the fresh context of a query and tracks the evolving dynamics of hashtags to evaluate their relevance in the present context. In contrast to prior methods that primarily rely on microblog content for hashtag recommendation, <em>HaRNaT</em> mines contextually related microblogs and assesses the relevance of co-occurring hashtags with news information. To accomplish this, it evaluates hashtag features, including pertinence, popularity among users, and association with other hashtags. In performance evaluation, <em>HaRNaT</em> trained on these features achieves a macro-averaged precision of 84% with Naive Bayes and 80% with Logistic Regression. 
Compared to <em>Hashtagify</em>, a hashtag search engine, <em>HaRNaT</em> offers a dynamically evolving set of hashtags.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"45 ","pages":"Article 100294"},"PeriodicalIF":0.0,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How does user-generated content on Social Media affect stock predictions? A case study on GameStop","authors":"Antonino Ferraro, Giancarlo Sperlì","doi":"10.1016/j.osnem.2024.100293","DOIUrl":"10.1016/j.osnem.2024.100293","url":null,"abstract":"<div><div>One of the main challenges in the financial market concerns the forecasting of stock behavior, which plays a key role in supporting the financial decisions of investors. In recent years, the large amount of available financial data and the heterogeneous contextual information have led researchers to investigate data-driven models using Artificial Intelligence (AI)-based approaches for forecasting stock prices. Recent methodologies focus mainly on analyzing participants from Reddit without considering other social media and how their combination affects the stock market, which remains an open challenge. In this paper, we combine financial data and textual user-generated information, which are provided as input to various deep learning models, to develop a stock forecasting system. The main novelties of the proposal concern the design of a multi-modal approach combining historical stock prices and sentiment scores extracted from different Online Social Networks (OSNs), also unveiling possible correlations among the heterogeneous information evaluated during the GameStop squeeze. In particular, we have examined several AI-based models and investigated the impact of textual data inferred from well-known Online Social Networks (<em>i.e.</em>, Reddit and Twitter) on stock market behavior by conducting a case study on GameStop. 
Although users’ dynamic opinions on social networks may have a detrimental impact on the stock prediction task, our investigation has demonstrated the usefulness of assessing user-generated content inferred from various OSNs on the market forecasting problem.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100293"},"PeriodicalIF":0.0,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142653576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Milo Z. Trujillo, Laurent Hébert-Dufresne, James Bagrow
{"title":"Measuring centralization of online platforms through size and interconnection of communities","authors":"Milo Z. Trujillo, Laurent Hébert-Dufresne, James Bagrow","doi":"10.1016/j.osnem.2024.100292","DOIUrl":"10.1016/j.osnem.2024.100292","url":null,"abstract":"<div><div>Decentralization of online social platforms offers a variety of potential benefits, including divesting of moderator and administrator authority among a wider population, allowing a variety of communities with differing social standards to coexist, and making the platform more resilient to technical or social attack. However, a platform offering a decentralized architecture does not guarantee that users will use it in a decentralized way, and measuring the centralization of socio-technical networks is not an easy task. In this paper we introduce a method of characterizing inter-community influence, to measure the impact that removing a community would have on the remainder of a platform. Our approach provides a careful definition of “centralization” appropriate in bipartite user-community socio-technical networks, and demonstrates the inadequacy of more trivial methods for interrogating centralization such as examining the distribution of community sizes. We use this method to compare the structure of five socio-technical platforms, and find that even decentralized platforms like Mastodon are far more centralized than any synthetic networks used for comparison. 
We discuss how this method can be used to identify when a platform is more centralized than it initially appears, either through inherent social pressure like assortative preferential attachment, or through astroturfing by platform administrators, and how this knowledge can inform platform governance and user trust.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100292"},"PeriodicalIF":0.0,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Giulio Corsi , Elizabeth Seger , Sean Ó hÉigeartaigh
{"title":"Crowdsourcing the Mitigation of disinformation and misinformation: The case of spontaneous community-based moderation on Reddit","authors":"Giulio Corsi , Elizabeth Seger , Sean Ó hÉigeartaigh","doi":"10.1016/j.osnem.2024.100291","DOIUrl":"10.1016/j.osnem.2024.100291","url":null,"abstract":"<div><div>Community-based content moderation, an approach that utilises user-generated knowledge to shape the ranking and display of online content, is recognised as a potential tool in combating disinformation and misinformation. This study examines this phenomenon on Reddit, which employs a platform-wide content ranking system based on user upvotes and downvotes. By empowering users to influence content visibility, Reddit's system serves as a naturally occurring community moderation mechanism, providing an opportunity to analyse how users engage with this system. Focusing on discussions related to climate change, we observe that in this domain, low-credibility content is spontaneously moderated by Reddit users, although the magnitude of this effect varies across Subreddits. We also identify temporal fluctuations in content removal rates, indicating dynamic and context-dependent patterns influenced by platform policies and socio-political factors. 
These findings highlight the potential of community-based moderation in mitigating online false information, offering valuable insights for the development of robust social media moderation frameworks.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100291"},"PeriodicalIF":0.0,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GASCOM: Graph-based Attentive Semantic Context Modeling for Online Conversation Understanding","authors":"Vibhor Agarwal, Yu Chen, Nishanth Sastry","doi":"10.1016/j.osnem.2024.100290","DOIUrl":"10.1016/j.osnem.2024.100290","url":null,"abstract":"<div><div>Online conversation understanding is an important yet challenging NLP problem which has many useful applications (e.g., hate speech detection). However, online conversations typically unfold over a series of posts and replies to those posts, forming a tree structure within which individual posts may refer to semantic context from elsewhere in the tree. Such semantic cross-referencing makes it difficult to understand a single post by itself; yet considering the entire conversation tree is not only difficult to scale but can also be misleading as a single conversation may have several distinct threads or points, not all of which are relevant to the post being considered. In this paper, we propose a <strong>G</strong>raph-based <strong>A</strong>ttentive <strong>S</strong>emantic <strong>CO</strong>ntext <strong>M</strong>odeling (GASCOM) framework for online conversation understanding. Specifically, we design two novel algorithms that utilize both the graph structure of the online conversation and the semantic information from individual posts to retrieve relevant context nodes from the whole conversation. We further design a <em>token-level</em> multi-head graph attention mechanism that attends differently to tokens from the selected context utterances for fine-grained conversation context modelling. Using this semantic conversational context, we re-examine two well-studied problems: polarity prediction and hate speech detection. Our proposed framework significantly outperforms state-of-the-art methods on both tasks, improving macro-F1 scores by 4.5% for polarity prediction and by 5% for hate speech detection. 
The GASCOM context weights also enhance interpretability.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100290"},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142441667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The influence of coordinated behavior on toxicity","authors":"Edoardo Loru , Matteo Cinelli , Maurizio Tesconi , Walter Quattrociocchi","doi":"10.1016/j.osnem.2024.100289","DOIUrl":"10.1016/j.osnem.2024.100289","url":null,"abstract":"<div><div>In the intricate landscape of social media, genuine content dissemination may be altered by a number of threats. Coordinated Behavior (CB), defined as orchestrated efforts by entities to deceive or mislead users about their identity and intentions, emerges as a tactic to exploit or manipulate online discourse. This study delves into the relationship between CB and toxic conversation on X (formerly known as Twitter). Using a dataset of 11 million tweets from 1 million users preceding the 2019 UK general election, we show that users displaying CB typically disseminate less harmful content, irrespective of political affiliation. However, distinct toxicity patterns emerge among different coordinated cohorts. Compared to their non-CB counterparts, CB participants show marginally higher toxicity levels only when considering their original posts. We further show the effects of CB-driven toxic content on non-CB users, gauging its impact based on political leanings. Our findings suggest that CB only has a limited impact on the toxicity of digital discourse.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100289"},"PeriodicalIF":0.0,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142428185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Friend2User : A new CNN based method for user network and content embedding","authors":"Amal Rekik, Salma Jamoussi","doi":"10.1016/j.osnem.2024.100288","DOIUrl":"10.1016/j.osnem.2024.100288","url":null,"abstract":"<div><div>Nowadays, social networks have become an integral part of modern society, significantly influencing individuals worldwide due to their extensive reach. Consequently, analyzing the data disseminated within these networks in order to identify online communities presents a major challenge for researchers in the data mining field. To address this challenge, we propose, in this paper, a novel deep user embedding framework for community extraction on social networks. Our method leverages the capability of Convolutional Neural Networks (CNNs) to produce abstract representations of users that preserve the semantic information in the data. Specifically, our approach considers both the profile content and the network structure, harnessing the power of unsupervised CNNs. The key concept underlying our proposal is that each user is represented not only by their own content but also by the content of their close friends. We employ a recursive CNN to integrate neighboring users’ content, thereby generating concise and informative user embeddings. 
The empirical findings obtained by our method demonstrate the effectiveness of our proposed user embeddings in efficiently detecting communities within social networks, particularly in the context of cybersecurity.</div></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100288"},"PeriodicalIF":0.0,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142327284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-community affinity: A polarization measure for multi-community networks","authors":"Sreeja Nair, Adriana Iamnitchi","doi":"10.1016/j.osnem.2024.100280","DOIUrl":"10.1016/j.osnem.2024.100280","url":null,"abstract":"<div><p>This article introduces a heterophily-based metric for assessing polarization in social networks when different opposing ideological communities coexist. The proposed metric measures polarization at the node level and is based on a node’s affinity for other communities. Node-level values can then be aggregated at the community, network, or any intermediate level, resulting in a more comprehensive map of polarization. We applied our metric to the Polblogs network, the White Helmets Twitter interaction network with two communities, and the VoterFraud2020 domain network with five communities. Additionally, we evaluated our metric on different sets of synthetic graphs to confirm that it yields low polarization scores, as expected. We built synthetic networks in three ways (synthetic labeling, dK-series, and network models) to assess how the proposed measure responds to various topologies and network features. Then, we compared our metric to two commonly used polarization metrics, Guerra’s boundary polarization and the random walk controversy score. 
We also examined how our suggested metric correlates with two network metrics: assortativity and modularity.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"43 ","pages":"Article 100280"},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142021566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}