{"title":"On the Infrastructure Providers That Support Misinformation Websites","authors":"Catherine Han, Deepak Kumar, Z. Durumeric","doi":"10.1609/icwsm.v16i1.19292","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19292","url":null,"abstract":"In this paper, we analyze the service providers that power 440 misinformation and hate sites, including hosting platforms, domain registrars, CDN providers, DDoS protection companies, advertising networks, donation processors, and e-mail providers. We find that several providers are disproportionately responsible for serving misinformation websites, most prominently Cloudflare. We further show that misinformation sites disproportionately rely on several popular ad networks and payment processors, including RevContent and Google DoubleClick. When misinformation websites are deplatformed by hosting providers, DDoS protection services, and registrars, sites nearly always resurface through alternative providers. However, anecdotally, we find that sites struggle to remain online when mainstream monetization channels are severed. We conclude with insights for infrastructure providers and researchers working to stem the spread of misinformation and hate content.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"s3-38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130160420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Limits of Multilayer Diffusion Network Inference in Social Media Research","authors":"Yan Xia, T. H. Chen, Mikko Kivelä","doi":"10.1609/icwsm.v16i1.19365","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19365","url":null,"abstract":"Information on social media spreads through an underlying diffusion network that connects people of common interests and opinions. This diffusion network often comprises multiple layers, each capturing the spreading dynamics of a certain type of information characterized by, for example, topic, language, or attitude. Researchers have previously proposed methods to infer these underlying multilayer diffusion networks from observed spreading patterns, but little is known about how well these methods perform across the range of realistic spreading data. In this paper, we conduct an extensive series of synthetic data experiments to systematically analyze the performance of the multilayer diffusion network inference framework, under varied network structure (e.g. density, number of layers) and information diffusion settings (e.g. cascade size, layer mixing) that are designed to mimic real-world spreading on social media. Our results show extreme performance variation of the inference framework: notably, it achieves much higher accuracy when inferring a denser diffusion network, while it fails to decompose the diffusion network correctly when most cascades in the data reach a limited audience. In demonstrating the conditions under which the inference accuracy is extremely low, our paper highlights the need to carefully evaluate the applicability of the inference before running it on real data. Practically, our results serve as a reference for this evaluation, and our publicly available implementation, which outperforms previous implementations in accuracy, supports further testing under personalized settings.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128535854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Automated Approach to Identifying Corporate Editing","authors":"V. Veselovsky, D. Sarkar, T. J. Anderson, R. Soden","doi":"10.1609/icwsm.v16i1.19357","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19357","url":null,"abstract":"OpenStreetMap (OSM) is the world’s largest peer-produced geospatial project. As a freely editable open map of the world that anyone may contribute to or make use of, the dynamics and motivations of its contributors have been the object of significant scholarship. A growing phenomenon in the OSM community is the increasing contributions of paid editing teams hired by tech corporations such as Microsoft, Apple, and Facebook. Though corporations have long supported OSM in various ways, the recent growth of teams of paid editors raises challenges to the community’s norms and policies, which are historically oriented around contributions by individual volunteers, making it hard to track the contribution of paid editors. This research addresses a fundamental problem in approaching these concerns: understanding the scale and character of corporate editing in OSM. We use machine learning to improve upon prior approaches to estimating this phenomenon, contributing both a novel methodology as well as a more robust understanding of the latest corporate editing behavior in OSM.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"132 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122123383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-Platform Multimodal Misinformation: Taxonomy, Characteristics and Detection for Textual Posts and Videos","authors":"Nicholas Micallef, Marcelo Sandoval-Castañeda, Adir Cohen, M. Ahamad, Srijan Kumar, Nasir D. Memon","doi":"10.1609/icwsm.v16i1.19323","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19323","url":null,"abstract":"Social media posts that direct users to YouTube videos are one of the most effective techniques for spreading misinformation. However, it has been observed that such posts rarely get deleted or flagged. Since multi-modal misinformation that leads to compelling videos has more impact than using just textual content, it is important to characterize and detect such textual post and video pairs to prevent users from becoming victims of misinformation. To address this gap, we build a taxonomy of how links to YouTube videos are used on social media platforms. We then use pairs of posts and videos annotated with this taxonomy to test several classification models built using cross-platform features. Our work reveals several characteristics of post-video pairs, in terms of how posts and videos are related to each other, the type of content they share, and their collective outcome. In addition, we find that traditional approaches to misinformation detection that rely only on text from posts miss a significant number of post-video pairs that contain misinformation. More importantly, we find that to reduce the spread of misinformation via post-video pairs, classifiers would be more effective if they are designed to use data and features from multiple diverse platforms.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126107615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Wikidata with Student-Generated Concept Maps","authors":"Hayden Freedman, A. Hoek, Bill Tomlinson","doi":"10.1609/icwsm.v16i1.19285","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19285","url":null,"abstract":"Wikidata is a publicly available, crowdsourced knowledge base that contains interlinked concepts structured for use by intelligent systems. While Wikidata has experienced rapid growth, it is far from complete and faces challenges that prevent it from being used to its full potential. In this paper, we propose a novel method for improving Wikidata by engaging undergraduate students to contribute previously missing knowledge via concept mapping assignments. Rather than allow students to edit Wikidata directly, we describe a workflow in which knowledge is constructed by students and then reviewed by an expert. We present a case study in which we deployed a workflow in a large undergraduate course about sustainability, and find that it was able to contribute a substantial number of high quality statements that persisted in and contributed previously missing knowledge to Wikidata. This work provides a preliminary workflow for improving Wikidata based on classroom assignments, as well as recommendations for how future educational projects could continue to improve Wikidata or other public knowledge bases.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117247090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying and Characterizing New Expressions of Community Framing during Polarization","authors":"Hernan Sarmiento, Felipe Bravo-Marquez, Eduardo Graells-Garrido, Bárbara Poblete","doi":"10.1609/icwsm.v16i1.19339","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19339","url":null,"abstract":"Chile experienced a series of important protests between October and December 2019. This social unrest, as it was called, was fueled by social inequity and radically affected the nation's status quo. A large portion of the population demanded a new Constitution and changes to the current government, whereas another part of the population rejected these social demands. This created a highly polarized scenario evidenced through online social media interactions. Analyzing controversial issues that emerge naturally from conversations in online communities can offer a more wide-scale understanding of today's political and societal discussions. Here, we analyze group polarization in social networks by studying the 2019 Chilean social unrest. Specifically, we propose an unsupervised approach for identifying and characterizing community framing (i.e., discovering and understanding polarized concepts). Our approach is based on the sequential application of community detection, topic modeling, and word embedding methods. The novelty of having an unsupervised approach is that it facilitates the performance of scalable and objective framing analyses with minimal human intervention, as it does not require prior domain or network knowledge. Using this methodology, we observe that an apparently similar conversation topic across communities can actually have completely different meanings to each group. We noted, for instance, that while an online community linked the term gente (people) with communism and terrorism, the other associated it with police and military oppression. In this direction, our work can help to contextualize real-world social issues in online platforms, describing how users discuss similar concepts with opposing views.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131695719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Social Support via Avatar Communication Buffers Harmful Effects of Offline Bullying Victimization","authors":"Masanori Takano, K. Yokotani","doi":"10.1609/icwsm.v16i1.19351","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19351","url":null,"abstract":"Online social support via avatar communication is a powerful tool for bullying victims because they often lack offline social resources. Additionally, avatar communication allows users rich nonverbal interactions (e.g., emotional expressions) while maintaining online anonymity. This study investigates the role of online social support via avatars for victims and how to facilitate such support. Accordingly, we conducted an online questionnaire survey twice on an avatar communication application, Pigg Party, regarding mental health, offline and online social support, and offline bullying victimization (participants: 3,288 (1st wave) and 758 (2nd wave)). We found that online social support via avatars supplemented insufficient offline social resources, particularly when there was a high risk of offline bullying victimization. Furthermore, we investigated how online social support is improved by ego networks using social network data from Pigg Party. We demonstrated that belonging to large and closely connected communities can enhance online social support. Our findings suggest that avatar communication applications can improve players' mental health through online social support, reinforced by facilitating ego networks.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115859825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Twitter-STMHD: An Extensive User-Level Database of Multiple Mental Health Disorders","authors":"Suhavi, A. Singh, Udit Arora, Somyadeep Shrivastava, Aryaveer Singh, R. Shah, P. Kumaraguru","doi":"10.1609/icwsm.v16i1.19368","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19368","url":null,"abstract":"Social media is equipped with the ability to track and quantify user behavior, establishing it as an appropriate resource for mental health studies. However, previous efforts in the area have been limited by the lack of data and contextually relevant information. There is a need for large-scale, well-labeled mental health datasets with fast reproducible methods to facilitate their heuristic growth. In this paper, we cater to this need by building the Twitter - Self-Reported Temporally-Contextual Mental Health Diagnosis Dataset (Twitter-STMHD), a large-scale, user-level dataset grouped into 8 disorder categories and a companion class of control users. The dataset is 60% hand-annotated, which led to the creation of high-precision self-reported diagnosis report patterns, used for the construction of the rest of the dataset. The dataset, instead of being a corpus of tweets, is a collection of user profiles of those suffering from mental health disorders to provide a holistic view of the problem statement. By leveraging temporal information, the data for a given profile in the dataset has been collected for disease prevalence periods: onset of disorder, diagnosis and progression, along with a fourth period: COVID-19. This is the only, and the largest, dataset that captures the tweeting activity of users suffering from mental health disorders during the COVID-19 period.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128933527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring Alignment of Online Grassroots Political Communities with Political Campaigns","authors":"Cameron Raymond, Isaac Waller, Ashton Anderson","doi":"10.1609/icwsm.v16i1.19336","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19336","url":null,"abstract":"Social media reduces barriers for the formation of large, self-organizing grassroots communities. For political campaigns this poses significant opportunities to address declining party membership, but also reputational risks and potential loss of campaign coherence. While balancing these factors is often done informally, we adopt a behavioural approach by using neural community embeddings to evaluate online communities along cultural, political, and demographic dimensions. We apply this technique to the 2020 U.S. Democratic presidential primaries and the website Reddit, providing novel insights into the important tension between campaigns and third-party actors. Using two benchmark comparison classes, we demonstrate that our embedding dimensions mirror their offline analogues, but more so the views of a candidate's supporters than the candidates themselves. Finally, we introduce temporal aspects to our community embedding to evaluate the stability of political communities and their interrelations. These analyses serve as an exploration and application of our novel embedding methodology, and give insight into the relationship between online communities and the movements they support.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129022372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PrivacyAlert: A Dataset for Image Privacy Prediction","authors":"Chenye Zhao, J. Mangat, Sujay Koujalgi, A. Squicciarini, Cornelia Caragea","doi":"10.1609/icwsm.v16i1.19387","DOIUrl":"https://doi.org/10.1609/icwsm.v16i1.19387","url":null,"abstract":"Image privacy issues have become an important challenge as millions of images are being shared on social networking sites every day. Often due to users' lack of privacy awareness and social pressure, users' posted images reveal sensitive information and may be easily used to their detriment. To address these issues, several recent studies have proposed machine learning models to automatically identify whether an image contains private information. However, progress on this important task has been hampered by the absence of reliable, publicly available, up-to-date datasets. To this end, we introduce PrivacyAlert, a dataset developed from recent images extracted from Flickr and annotated with privacy labels (private or public). Our data collection process is based on state-of-the-art privacy taxonomy and captures a comprehensive set of image types of various sensitivity. We perform a comprehensive analysis of our dataset and report image privacy prediction results using classic and deep learning models to set the ground for future studies. Our dataset is publicly available at: https://doi.org/10.5281/zenodo.6406870.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127845567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}