{"title":"Representing and Determining Argumentative Relevance in Online Discussions: A General Approach","authors":"Zhen Guo, Munindar P. Singh","doi":"10.1609/icwsm.v17i1.22146","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22146","url":null,"abstract":"Understanding an online argumentative discussion is essential for understanding users' opinions on a topic and their underlying reasoning. A key challenge in determining completeness and persuasiveness of argumentative discussions is to assess how arguments under a topic are connected in a logical and coherent manner. Online argumentative discussions, in contrast to essays or face-to-face communication, challenge techniques for judging argument relevance because online discussions involve multiple participants and often exhibit incoherence in reasoning and inconsistencies in writing style. We define relevance as the logical and topical connections between small texts representing argument fragments in online discussions. We provide a corpus comprising pairs of sentences, labeled with argumentative relevance between the sentences in each pair. We propose a computational approach relying on content reduction and a Siamese neural network architecture for modeling argumentative connections and determining argumentative relevance between texts. 
Experimental results indicate that our approach is effective in measuring relevance between arguments and outperforms strong and well-adopted baselines. Further analysis demonstrates the benefit of using our argumentative relevance encoding on a downstream task, predicting how impactful an online comment is to a certain topic, compared to an encoding that does not consider logical connection.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130288178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HateMM: A Multi-Modal Dataset for Hate Video Classification","authors":"Mithun Das, R. Raj, Punyajoy Saha, Binny Mathew, Manish Gupta, Animesh Mukherjee","doi":"10.48550/arXiv.2305.03915","DOIUrl":"https://doi.org/10.48550/arXiv.2305.03915","url":null,"abstract":"Hate speech has become one of the most significant issues in modern society, having implications in both the online and the offline world. Due to this, hate speech research has recently gained a lot of traction. However, most of the work has primarily focused on text media with relatively little work on images and even less on videos. Thus, early-stage automated video moderation techniques are needed to handle the videos that are being uploaded to keep the platform safe and healthy. With a view to detecting and removing hateful content from video-sharing platforms, our work focuses on hate video detection using multi-modalities. To this end, we curate ~43 hours of videos from BitChute and manually annotate them as hate or non-hate, along with the frame spans which could explain the labelling decision. To collect the relevant videos we harnessed search keywords from hate lexicons. We observe various cues in images and audio of hateful videos. Further, we build deep learning multi-modal models to classify the hate videos and observe that using all the modalities of the videos improves the overall hate speech detection performance (accuracy=0.798, macro F1-score=0.790) by ~5.7% compared to the best uni-modal model in terms of macro F1 score. 
In summary, our work takes the first step toward understanding and modeling hateful videos on video hosting platforms such as BitChute.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131867648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reddit in the Time of COVID","authors":"V. Veselovsky, Ashton Anderson","doi":"10.48550/arXiv.2304.10777","DOIUrl":"https://doi.org/10.48550/arXiv.2304.10777","url":null,"abstract":"When the COVID-19 pandemic hit, much of life moved online. Platforms of all types reported surges of activity, and people remarked on the various important functions that online platforms suddenly fulfilled. However, researchers lack a rigorous understanding of the pandemic's impacts on social platforms---and whether they were temporary or long-lasting. We present a conceptual framework for studying the large-scale evolution of social platforms and apply it to the study of Reddit's history, with a particular focus on the COVID-19 pandemic. We study platform evolution through two key dimensions: structure vs. content and macro- vs. micro-level analysis. Structural signals help us quantify how much behavior changed, while content analysis clarifies exactly how it changed. Applying these at the macro-level illuminates platform-wide changes, while at the micro-level we study impacts on individual users. We illustrate the value of this approach by showing the extraordinary and ordinary changes Reddit went through during the pandemic. First, we show that typically when rapid growth occurs, it is driven by a few concentrated communities and within a narrow slice of language use. However, Reddit's growth throughout COVID-19 was spread across disparate communities and languages. Second, all groups were equally affected in their change of interest, but veteran users tended to invoke COVID-related language more than newer users. Third, the new wave of users that arrived following COVID-19 was fundamentally different from previous cohorts of new users in terms of interests, activity, and likelihood of staying active on the platform. 
These findings provide a more rigorous understanding of how an online platform changed during the global pandemic.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127223732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Auditing Elon Musk's Impact on Hate Speech and Bots","authors":"Daniel Hickey, Matheus Schmitz, D. Fessler, P. Smaldino, Goran Muric, K. Burghardt","doi":"10.48550/arXiv.2304.04129","DOIUrl":"https://doi.org/10.48550/arXiv.2304.04129","url":null,"abstract":"On October 27th, 2022, Elon Musk purchased Twitter, becoming its new CEO and firing many top executives in the process. Musk listed fewer restrictions on content moderation and removal of spam bots among his goals for the platform. Given findings of prior research on moderation and hate speech in online communities, the promise of less strict content moderation poses the concern that hate will rise on Twitter. We examine the levels of hate speech and prevalence of bots before and after Musk's acquisition of the platform. We find that hate speech rose dramatically upon Musk purchasing Twitter and the prevalence of most types of bots increased, while the prevalence of astroturf bots decreased.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124432140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Truth Social Dataset","authors":"Patrick Gérard, Nicholas Botzer, Tim Weninger","doi":"10.48550/arXiv.2303.11240","DOIUrl":"https://doi.org/10.48550/arXiv.2303.11240","url":null,"abstract":"Formally announced to the public following former President Donald Trump’s bans and suspensions from mainstream social networks in early 2022 following his role in the January 6 Capitol Riots, Truth Social was launched as an \"alternative\" social media platform that claims to be a refuge for free speech, offering a platform for those disaffected by the content moderation policies of then-existing, mainstream social networks. The subsequent rise of Truth Social has been driven largely by hard-line supporters of the former president as well as those affected by the content moderation of other social networks. These distinct qualities, combined with its status as the main mouthpiece of the former president, position Truth Social as a particularly influential social media platform and give rise to several research questions. However, outside of a handful of news reports, little is known about the new social media platform, partially due to a lack of well-curated data. In the current work, we describe a dataset of over 823,000 posts to Truth Social and a social network with over 454,000 distinct users. 
In addition to the dataset itself, we also present some basic analysis of its content, certain temporal features, and its network.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"154 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114651601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wiki-based Communities of Interest: Demographics and Outliers","authors":"Hiba Arnaout, S. Razniewski, Jeff Z. Pan","doi":"10.48550/arXiv.2303.09189","DOIUrl":"https://doi.org/10.48550/arXiv.2303.09189","url":null,"abstract":"In this paper, we release data about demographic information and outliers of communities of interest. Identified from Wiki-based sources, mainly Wikidata, the data covers 7.5k communities, e.g., members of the White House Coronavirus Task Force, and 345k subjects, e.g., Deborah Birx. We describe the statistical inference methodology adopted to mine such data. We release subject-centric and group-centric datasets in JSON format, as well as a browsing interface. Finally, we foresee three areas where this dataset can be useful: in social sciences research, it provides a resource for demographic analyses; in web-scale collaborative encyclopedias, it serves as an edit recommender to fill knowledge gaps; and in web search, it offers lists of salient statements about queried subjects for higher user engagement. The dataset can be accessed at: https://doi.org/10.5281/zenodo.7410436","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123901947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Followback Clusters, Satellite Audiences, and Bridge Nodes: Coengagement Networks for the 2020 US Election","authors":"Andrew Beers, Joseph S. Schafer, Ian Kennedy, Morgan Wack, Emma S. Spiro, Kate Starbird","doi":"10.48550/arXiv.2303.04620","DOIUrl":"https://doi.org/10.48550/arXiv.2303.04620","url":null,"abstract":"The 2020 United States (US) presidential election was — and has continued to be — the focus of pervasive and persistent mis- and disinformation spreading through our media ecosystems, including social media. This event has driven the collection and analysis of large, directed social network datasets, but such datasets can resist intuitive understanding. In such large datasets, the overwhelming number of nodes and edges present in typical representations create visual artifacts, such as densely overlapping edges and tightly-packed formations of low-degree nodes, which obscure many features of more practical interest. We apply a method, coengagement transformations, to convert such networks of social data into tractable images. Intuitively, this approach allows for parameterized network visualizations that make shared audiences of engaged viewers salient to viewers. Using the interpretative capabilities of this method, we perform an extensive case study of the 2020 United States presidential election on Twitter, contributing an empirical analysis of coengagement. By creating and contrasting different networks at different parameter sets, we define and characterize several structures in this discourse network, including bridging accounts, satellite audiences, and followback communities. We discuss the importance and implications of these empirical network features in this context. 
In addition, we release open-source code for creating coengagement networks from Twitter and other structured interaction data.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122027434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"#RoeOverturned: Twitter Dataset on the Abortion Rights Controversy","authors":"Rong-Ching Chang, Ashwin Rao, Qiankun Zhong, Magdalena E. Wojcieszak, Kristina Lerman","doi":"10.48550/arXiv.2302.01439","DOIUrl":"https://doi.org/10.48550/arXiv.2302.01439","url":null,"abstract":"On June 24, 2022, the United States Supreme Court overturned landmark rulings made in its 1973 verdict in Roe v. Wade. The justices by way of a majority vote in Dobbs v. Jackson Women's Health Organization, decided that abortion wasn't a constitutional right and returned the issue of abortion to the elected representatives. This decision triggered multiple protests and debates across the US, especially in the context of the midterm elections in November 2022. Given that many citizens use social media platforms to express their views and mobilize for collective action, and given that online debate provides tangible effects on public opinion, political participation, news media coverage, and the political decision-making, it is crucial to understand online discussions surrounding this topic. Toward this end, we present the first large-scale Twitter dataset collected on the abortion rights debate in the United States. We present a set of 74M tweets systematically collected over the course of one year from January 1, 2022 to January 6, 2023.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121190444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Team Resilience under Shock: An Empirical Analysis of GitHub Repositories during Early COVID-19 Pandemic","authors":"Xuan Lu, W. Ai, Yixin Wang, Q. Mei","doi":"10.48550/arXiv.2301.12326","DOIUrl":"https://doi.org/10.48550/arXiv.2301.12326","url":null,"abstract":"While many organizations have shifted to working remotely during the COVID-19 pandemic, how the remote workforce and remote teams are influenced by and would respond to this and future shocks remains largely unknown. Software developers have relied on remote collaborations long before the pandemic, working in virtual teams (GitHub repositories). The dynamics of these repositories through the pandemic provide a unique opportunity to understand how remote teams react under shock. This work presents a systematic analysis. We measure the overall effect of the early pandemic on public GitHub repositories by comparing their sizes and productivity with the counterfactual outcomes forecasted as if there were no pandemic. We find that the productivity level and the number of active members of these teams vary significantly during different periods of the pandemic. We then conduct a finer-grained investigation and study the heterogeneous effects of the shock on individual teams. We find that the resilience of a team is highly correlated with certain properties of the team before the pandemic. 
Through a bootstrapped regression analysis, we reveal which types of teams are robust or fragile to the shock.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115461790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Author as Character and Narrator: Deconstructing Personal Narratives from the r/AmITheAsshole Reddit Community","authors":"Salvatore Giorgi, Ke Zhao, Alexander H. Feng, Lara J. Martin","doi":"10.48550/arXiv.2301.08104","DOIUrl":"https://doi.org/10.48550/arXiv.2301.08104","url":null,"abstract":"In the r/AmITheAsshole subreddit, people anonymously share first person narratives that contain some moral dilemma or conflict and ask the community to judge who is at fault (i.e., who is \"the asshole\"). These first person narratives are, in general, a unique storytelling domain where the author is not only the narrator (the person telling the story) but is also a character (the person living the story) and, thus, the author has two distinct voices presented in the story. In this study, we identify linguistic and narrative features associated with the author as the character or as a narrator. We use these features to answer the following questions: (1) what makes an asshole character and (2) what makes an asshole narrator? We extract both Author-as-Character features (e.g., demographics, narrative event chain, and emotional arc) and Author-as-Narrator features (i.e., the style and emotion of the story as a whole) in order to identify which aspects of the narrative are correlated with the final moral judgment. 
Our work shows that \"assholes\" as Characters frame themselves as lacking agency with a more positive personal arc, while \"assholes\" as Narrators will tell emotional and opinionated stories.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125969583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}