Lucas E.B. Skora, Helen C.M. Senefonte, Myriam Regattieri Delgado, Ricardo Lüders, Thiago H. Silva
{"title":"Comparing global tourism flows measured by official census and social sensing","authors":"Lucas E.B. Skora , Helen C.M. Senefonte , Myriam Regattieri Delgado , Ricardo Lüders , Thiago H. Silva","doi":"10.1016/j.osnem.2022.100204","DOIUrl":"https://doi.org/10.1016/j.osnem.2022.100204","url":null,"abstract":"<div><p>A better understanding of the behavior of tourists is strategic for improving services in the competitive and important economic segment of global tourism. Critical studies in the literature often explore the issue using traditional data, such as questionnaires or interviews. Traditional approaches provide precious information; however, they impose challenges to obtaining large-scale data, making it hard to study worldwide patterns. Location-based social networks (LBSNs) can potentially mitigate such issues due to the relatively low cost of acquiring large amounts of behavioral data. Nevertheless, before using such data for studying tourists’ behavior, it is necessary to verify whether the information adequately reveals the behavior measured with traditional data — considered the ground truth. Thus, the present work investigates in which countries the global tourism network measured with an LBSN agreeably reflects the behavior estimated by the World Tourism Organization using traditional methods. Although we could find exceptions, the results suggest that, for most countries, LBSN data can satisfactorily represent the behavior studied. 
We have an indication that, in countries with high correlations between results obtained from both datasets, LBSN data can be used in research regarding the mobility of the tourists in the studied context.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"29 ","pages":"Article 100204"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"137156824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing subjectivity level to mitigate identity term bias in toxic comments classification","authors":"Zhixue Zhao, Ziqi Zhang, Frank Hopfgartner","doi":"10.1016/j.osnem.2022.100205","DOIUrl":"10.1016/j.osnem.2022.100205","url":null,"abstract":"<div><p><span><span><span>Toxic comment classification models are often found biased towards identity terms, i.e., terms characterizing a specific group of people such as “Muslim” and “black”. Such bias is commonly reflected in </span>false positive predictions, i.e., non-toxic comments with identity terms. In this work, we propose a novel approach to debias the model in toxic comment classification, leveraging the notion of subjectivity level of a comment and the presence of identity terms. We hypothesize that toxic comments containing identity terms are more likely to be expressions of subjective feelings or opinions. Therefore, the subjectivity level of a comment containing identity terms can be helpful for classifying toxic comments and mitigating the identity term bias. To implement this idea, we propose a model based on </span>BERT and study two different methods of measuring the subjectivity level. The first method uses a lexicon-based tool. The second method is based on the idea of calculating the embedding similarity between a comment and a relevant Wikipedia text of the identity term in the comment. We thoroughly evaluate our method on an extensive collection of four datasets collected from different </span>social media platforms<span>. 
Our results show that: (1) our models that incorporate both features of subjectivity and identity terms consistently outperform strong SOTA baselines, with our best performing model achieving an improvement in F1 of 4.75% over a Twitter dataset; (2) our idea of measuring subjectivity based on the similarity to the relevant Wikipedia text is very effective on toxic comment classification as our model using this has achieved the best performance on 3 out of 4 datasets while obtaining comparative performance on the remaining dataset. We further test our method on RoBERTa to evaluate the generality of our method and the results show the biggest improvement in F1 of up to 1.29% (on a dataset from a white supremacist online forum).</span></p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"29 ","pages":"Article 100205"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117258021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Usman Anjum, Vladimir Zadorozhny, Prashant Krishnamurthy
{"title":"Localization of Unidentified Events with Raw Microblogging Data","authors":"Usman Anjum, Vladimir Zadorozhny, Prashant Krishnamurthy","doi":"10.1016/j.osnem.2022.100209","DOIUrl":"10.1016/j.osnem.2022.100209","url":null,"abstract":"<div><p><span><span>Event localization is the task of finding the location of an event. Commonly, event localization using microblogging services, like Twitter, uses the contents of the messages and the </span>geographical information<span> associated with the messages. In this paper, we propose a novel approach called SPARE (SPAtial REconstruction) that bypasses the need for geographical or semantic information to localize tweets. We assume there are reference coordinates at known locations that scrape the microblog (tweet) counts in time and space (circular regions around the reference coordinate). The counts of tweets are aggregated which are then disaggregated to identify event patterns. The change in counts of tweets would be indicative of an event pattern. We show, using real data, that the change in counts of tweets is manifested as peaks. The peaks from multiple reference coordinates can be used as an input to </span></span>trilateration techniques to pinpoint the location of an event. We introduce metrics to identify the quality of disaggregation of fine-grained data and examine techniques like filtering to improve accuracy of event location. 
The experimental results show that our method can identify the location of an event with high accuracy.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"29 ","pages":"Article 100209"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128221538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rumour spread minimization in social networks: A source-ignorant approach","authors":"Ahmad Zareie, Rizos Sakellariou","doi":"10.1016/j.osnem.2022.100206","DOIUrl":"10.1016/j.osnem.2022.100206","url":null,"abstract":"<div><p>The spread of rumours in social networks has become a significant challenge in recent years. Blocking so-called critical edges, that is, edges that have a significant role in the spreading process, has attracted lots of attention as a means to minimize the spread of rumours. Although the detection of the sources of rumour may help identify critical edges this has an overhead that source-ignorant approaches are trying to eliminate. Several source-ignorant edge blocking methods have been proposed which mostly determine critical edges on the basis of centrality. Taking into account additional features of edges (beyond centrality) may help determine what edges to block more accurately. In this paper, a new source-ignorant method is proposed to identify a set of critical edges by considering for each edge the impact of blocking and the influence of the nodes connected to the edge. 
Experimental results demonstrate that the proposed method can identify critical edges more accurately in comparison to other source-ignorant methods.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"29 ","pages":"Article 100206"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468696422000106/pdfft?md5=5c46e8ade686686c561918b3c01408b9&pid=1-s2.0-S2468696422000106-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130196186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rafael M.O. Cruz, Woshington V. de Sousa, George D.C. Cavalcanti
{"title":"Selecting and combining complementary feature representations and classifiers for hate speech detection","authors":"Rafael M.O. Cruz , Woshington V. de Sousa , George D.C. Cavalcanti","doi":"10.1016/j.osnem.2021.100194","DOIUrl":"https://doi.org/10.1016/j.osnem.2021.100194","url":null,"abstract":"<div><p><span><span>Hate speech is a major issue in social networks due to the high volume of data generated daily. Recent works demonstrate the usefulness of machine learning (ML) in dealing with the nuances required to distinguish hateful posts from mere sarcasm or offensive language. Many ML solutions for hate speech detection have been proposed by either changing how features are extracted from the text or the </span>classification algorithm<span><span><span> employed. However, most works consider only one type of feature extraction and classification algorithm. This work argues that a combination of multiple feature extraction techniques and different classification models is needed. We propose a framework to analyze the relationship between multiple feature extraction and </span>classification techniques to understand how they complement each other. The framework is used to select a subset of complementary techniques to compose a robust </span>multiple classifiers system<span> (MCS) for hate speech detection. The experimental study considering four hate speech classification datasets demonstrates that the proposed framework is a promising methodology for analyzing and designing high-performing MCS for this task. The MCS obtained using the proposed framework significantly outperforms the combination of all models and the homogeneous and heterogeneous selection heuristics, demonstrating the importance of having a proper selection scheme. 
Source code, figures and dataset splits can be found in the GitHub repository: </span></span></span><span>https://github.com/Menelau/Hate-Speech-MCS</span>.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"28 ","pages":"Article 100194"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91737144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonardo Tonetto, Malintha Adikari, Nitinder Mohan, Aaron Yi Ding, Jörg Ott
{"title":"Contact duration: Intricacies of human mobility","authors":"Leonardo Tonetto , Malintha Adikari , Nitinder Mohan , Aaron Yi Ding , Jörg Ott","doi":"10.1016/j.osnem.2021.100196","DOIUrl":"https://doi.org/10.1016/j.osnem.2021.100196","url":null,"abstract":"<div><p>Human mobility shapes our daily lives, our urban environment and even the trajectory of a global pandemic. While various aspects of human mobility and inter-personal contact duration have already been studied separately, little is known about how these two key aspects of our daily lives are fundamentally connected. Better understanding of such interconnected human behaviors is crucial for studying infectious diseases, as well as opportunistic content forwarding. To address these deficiencies, we conducted a study on a mobile social network of human mobility and contact duration, using data from 71 persons based on GPS and Bluetooth logs for 2 months in 2018. We augment these data with location APIs, enabling a finer granular characterization of the users’ mobility in addition to contact patterns. We model stops durations to reveal how time-unbounded-stops (<em>e.g.</em>, bars or restaurants) follow a log-normal distribution while time-bounded-stops (<em>e.g.</em>, offices, hotels) follow a power-law distribution. Furthermore, our analysis reveals contact duration adheres to a log-normal distribution, which we use to model the duration of contacts as a function of the duration of stays. We further extend our understanding of contact duration during trips by modeling these times as a Weibull distribution whose parameters are a function of trip length. 
These results could better inform models for information or epidemic spreading, helping guide the future design of network protocols as well as policy decisions.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"28 ","pages":"Article 100196"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468696421000720/pdfft?md5=3f4081e0dafc13110ea3b0ba03ef6285&pid=1-s2.0-S2468696421000720-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91696282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Billy Spann, Esther Mead, Maryam Maleki, Nitin Agarwal, Therese Williams
{"title":"Applying diffusion of innovations theory to social networks to understand the stages of adoption in connective action campaigns","authors":"Billy Spann , Esther Mead , Maryam Maleki , Nitin Agarwal , Therese Williams","doi":"10.1016/j.osnem.2022.100201","DOIUrl":"https://doi.org/10.1016/j.osnem.2022.100201","url":null,"abstract":"<div><p><span>This research proposes a conceptual framework for determining the adoption trajectory of information diffusion in connective action campaigns. This approach reveals whether an information campaign is accelerating, reached critical mass, or decelerating during its life cycle. The experimental approach taken in this study builds on the diffusion of innovations theory, critical mass theory, and previous s-shaped production function research to provide ideas for modeling future connective action campaigns. Most social science research on connective action has taken a qualitative approach. There are limited quantitative studies, but most focus on statistical validation of the qualitative approach, such as surveys, or only focus on one aspect of connective action. In this study, we extend the social science research on connective action theory by applying a mixed-method computational analysis to examine the affordances and features offered through </span>online social networks (OSNs) and then present a new method to quantify the emergence of these action networks. Using the s-curves revealed through plotting the information campaigns usage, we apply a diffusion of innovations lens to the analysis to categorize users into different stages of adoption of information campaigns. We then categorize the users in each campaign by examining their affordance and interdependence relationships by assigning retweets, mentions, and original tweets to the type of relationship they exhibit. 
The contribution of this analysis provides a foundation for mathematical characterization of connective action signatures, and further, offers policymakers insights about campaigns as they evolve. To evaluate our framework, we present a comprehensive analysis of COVID-19 Twitter data. Establishing this theoretical framework will help researchers develop predictive models to more accurately model campaign dynamics.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"28 ","pages":"Article 100201"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90019833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lynnette Hui Xian Ng, Dawn C. Robertson, Kathleen M. Carley
{"title":"Stabilizing a supervised bot detection algorithm: How much data is needed for consistent predictions?","authors":"Lynnette Hui Xian Ng, Dawn C. Robertson, Kathleen M. Carley","doi":"10.1016/j.osnem.2022.100198","DOIUrl":"https://doi.org/10.1016/j.osnem.2022.100198","url":null,"abstract":"<div><p>Social media bots have been characterized in their use in digital activism and information manipulation, due to their roles in information diffusion. The detection of bots has been a major task within the field of social media computation, and many datasets and bot detection algorithms have been developed. With these algorithms, the bot score stability is key in estimating the impact of bots on the diffusion of information. Within several experiments on Twitter agents, we quantify the amount of data required for consistent bot predictions and analyze agent bot classification behavior. Through this study, we developed a methodology to establish parameters for stabilizing the bot probability score through threshold, temporal and volume analysis, eventually quantifying suitable threshold values for bot classification (i.e. whether the agent is a bot or not) and reasonable data collection size (i.e. 
number of days of tweets or number of tweets) for stable scores and bot classification.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"28 ","pages":"Article 100198"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468696422000027/pdfft?md5=879d4a241d8634d464a12524eaf23546&pid=1-s2.0-S2468696422000027-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91696283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Valerio Arnaboldi, Marco Conti, Andrea Passarella, Robin I.M. Dunbar
{"title":"Erratum to Online Social Networks and information diffusion: The role of ego networks: Online Social Networks and Media, Volume 1 (June 2017), Pages 44-55","authors":"Valerio Arnaboldi , Marco Conti , Andrea Passarella , Robin I.M. Dunbar","doi":"10.1016/j.osnem.2021.100184","DOIUrl":"10.1016/j.osnem.2021.100184","url":null,"abstract":"","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"27 ","pages":"Article 100184"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468696421000628/pdfft?md5=7b90bb651c421f310f601ebc13af3388&pid=1-s2.0-S2468696421000628-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131380779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding and identifying the use of emotes in toxic chat on Twitch","authors":"Jaeheon Kim , Donghee Yvette Wohn , Meeyoung Cha","doi":"10.1016/j.osnem.2021.100180","DOIUrl":"10.1016/j.osnem.2021.100180","url":null,"abstract":"<div><p>The latest advances in NLP (natural language processing) have led to the launch of the much needed machine-driven toxic chat detection. Nevertheless, people continuously find new forms of hateful expressions that are easily identified by humans, but not by machines. One such common expression is the mix of text and emotes, a type of visual toxic chat that is increasingly used to evade algorithmic moderation and a trend that is an under-studied aspect of the problem of online toxicity. This research analyzes chat conversations from the popular streaming platform Twitch to understand the varied types of visual toxic chat. Emotes were sometimes used to replace a letter, seek attention, or for emotional expression. We created a labeled dataset that contains 29,721 cases of emotes replacing letters. 
Based on the dataset, we built a neural network classifier and identified visual toxic chat that would otherwise be undetected through traditional methods and caught an additional 1.3% examples of toxic chat out of 15 million chat utterances.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":"27 ","pages":"Article 100180"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468696421000598/pdfft?md5=74d9b0d4cdd5859c36ea8a0c200c176d&pid=1-s2.0-S2468696421000598-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123624066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}