EPJ Data Science | Pub Date: 2024-04-10 | DOI: 10.1140/epjds/s13688-024-00464-3
Matteo Serafino, Zhenkun Zhou, José S. Andrade, Alexandre Bovet, Hernán A. Makse
{"title":"Suspended accounts align with the Internet Research Agency misinformation campaign to influence the 2016 US election","authors":"Matteo Serafino, Zhenkun Zhou, José S. Andrade, Alexandre Bovet, Hernán A. Makse","doi":"10.1140/epjds/s13688-024-00464-3","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00464-3","url":null,"abstract":"<p>The ongoing debate surrounding the impact of the Internet Research Agency’s (IRA) social media campaign during the 2016 U.S. presidential election has largely overshadowed the involvement of other actors. Our analysis brings to light a substantial group of suspended Twitter users, outnumbering the IRA user group by a factor of 60, who align with the ideologies of the IRA campaign. Our study demonstrates that this group of suspended Twitter accounts significantly influenced individuals categorized as undecided or weak supporters, potentially with the aim of swaying their opinions, as indicated by Granger causality.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"49 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140564095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-04-04 | DOI: 10.1140/epjds/s13688-024-00469-y
{"title":"Unveiling the silent majority: stance detection and characterization of passive users on social media using collaborative filtering and graph convolutional networks","authors":"","doi":"10.1140/epjds/s13688-024-00469-y","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00469-y","url":null,"abstract":"<h3>Abstract</h3> <p>Social Media (SM) has become a popular medium for individuals to share their opinions on various topics, including politics, social issues, and daily affairs. During controversial events such as political elections, active users often proclaim their stance and try to persuade others to support them. However, disparities in participation levels can lead to misperceptions and cause analysts to misjudge the support for each side. For example, current models usually rely on content production and overlook a vast majority of civically engaged users who passively consume information. These “silent users” can significantly impact the democratic process despite being less vocal. Accounting for the stances of this silent majority is critical to improving our reliance on SM to understand and measure social phenomena. Thus, this study proposes and evaluates a new approach for silent users’ stance prediction based on collaborative filtering and Graph Convolutional Networks, which exploits multiple relationships between users and topics. Furthermore, our method allows us to describe users with different stances and online behaviors. We demonstrate its validity using real-world datasets from two related political events. Specifically, we examine user attitudes leading to the Chilean constitutional referendums in 2020 and 2022 through extensive Twitter datasets. In both datasets, our model outperforms the baselines by over 9% at the edge- and the user level. 
Thus, our method offers an improvement in effectively quantifying the support and creating a multidimensional understanding of social discussions on SM platforms, especially during polarizing events.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"32 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140563977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-04-02 | DOI: 10.1140/epjds/s13688-024-00468-z
{"title":"Science as exploration in a knowledge landscape: tracing hotspots or seeking opportunity?","authors":"","doi":"10.1140/epjds/s13688-024-00468-z","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00468-z","url":null,"abstract":"<h3>Abstract</h3> <p>The selection of research topics by scientists can be viewed as an exploration process conducted by individuals with cognitive limitations traversing a complex cognitive landscape influenced by both individual and social factors. While existing theoretical investigations have provided valuable insights, the intricate and multifaceted nature of modern science hinders the implementation of empirical experiments. This study leverages advancements in Geographic Information System (GIS) techniques to investigate the patterns and dynamic mechanisms of topic-transition among scientists. By constructing the knowledge space across 6 large-scale disciplines, we depict the trajectories of scientists’ topic transitions within this space, measuring the flow and distance of research regions across different sub-spaces. Our findings reveal a predominantly conservative pattern of topic transition at the individual level, with scientists primarily exploring local knowledge spaces. Furthermore, simulation modeling analysis identifies research intensity, driven by the concentration of scientists within a specific region, as the key facilitator of topic transition. Conversely, the knowledge distance between fields serves as a significant barrier to exploration. Notably, despite potential opportunities for breakthrough discoveries at the intersection of subfields, empirical evidence suggests that these opportunities do not exert a strong pull on scientists, leading them to favor familiar research areas. 
Our study provides valuable insights into the exploration dynamics of scientific knowledge production, highlighting the influence of individual cognition, social factors, and the intrinsic structure of the knowledge landscape itself. These findings offer a framework for understanding and potentially shaping the course of scientific progress.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"2013 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140564049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-03-26 | DOI: 10.1140/epjds/s13688-024-00462-5
{"title":"Unveiling public perception of AI ethics: an exploration on Wikipedia data","authors":"","doi":"10.1140/epjds/s13688-024-00462-5","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00462-5","url":null,"abstract":"<h3>Abstract</h3> <p>Artificial Intelligence (AI) technologies have exposed more and more ethical issues while providing services to people. In most cases, it is difficult for people to recognize when AI ethical issues occur. The lower the public awareness, the more difficult it is to address AI ethical issues. Many previous studies have explored public reactions and opinions on AI ethical issues through questionnaires and social media platforms like Twitter. However, these approaches primarily focus on categorizing popular topics and sentiments, overlooking the public’s potential lack of knowledge underlying these issues. Few studies have revealed the holistic knowledge structure of AI ethical topics and the relations among the subtopics. As the world’s largest online encyclopedia, Wikipedia encourages people to jointly contribute and share their knowledge by adding new topics and following a well-accepted hierarchical structure. Through public viewing and editing, Wikipedia serves as a proxy for knowledge transmission. This study aims to analyze how the public comprehends the body of knowledge of AI ethics. We adopted a community detection approach to identify the hierarchical community of the AI ethical topics, and further extracted the AI ethics-related entities, which are proper nouns, organizations, and persons. The findings reveal that the primary topics at the top-level community, most pertinent to AI ethics, predominantly revolve around knowledge-based and ethical issues. Examples include transitions from Information Theory to Internet Copyright Infringement. 
In summary, this study aims to make three contributions: (1) to present the holistic knowledge structure of AI ethics, (2) to evaluate and improve the existing body of knowledge of AI ethics, and (3) to enhance public perception of AI ethics to mitigate the risks associated with AI technologies.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"101 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140300351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-03-26 | DOI: 10.1140/epjds/s13688-024-00461-6
Manuel Pratelli, Marinella Petrocchi, Fabio Saracco, Rocco De Nicola
{"title":"Online disinformation in the 2020 U.S. election: swing vs. safe states","authors":"Manuel Pratelli, Marinella Petrocchi, Fabio Saracco, Rocco De Nicola","doi":"10.1140/epjds/s13688-024-00461-6","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00461-6","url":null,"abstract":"<p>For U.S. presidential elections, most states use the so-called winner-take-all system, in which the state’s presidential electors are awarded to the winning political party in the state after a popular vote phase, regardless of the actual margin of victory. Therefore, election campaigns are especially intense in states where there is no clear indication of which party will win. These states are often referred to as <i>swing states</i>. To measure the impact of such an election law on the campaigns, we analyze the Twitter activity surrounding the 2020 U.S. pre-election debate, with a particular focus on the spread of disinformation. We find that about 88% of the online traffic was associated with swing states. In addition, the sharing of links to unreliable news sources is significantly more prevalent in tweets associated with swing states: in this case, untrustworthy tweets are predominantly generated by automated accounts. Furthermore, we observe that the debate is mostly led by two main communities, one with a predominantly Republican affiliation and the other with accounts of different political orientations. 
Most of the disinformation comes from the former.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"33 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140300355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-03-22 | DOI: 10.1140/epjds/s13688-024-00463-4
Mohamed Amine Bouzaghrane, Hassan Obeid, Marta González, Joan Walker
{"title":"Human mobility reshaped? Deciphering the impacts of the Covid-19 pandemic on activity patterns, spatial habits, and schedule habits","authors":"Mohamed Amine Bouzaghrane, Hassan Obeid, Marta González, Joan Walker","doi":"10.1140/epjds/s13688-024-00463-4","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00463-4","url":null,"abstract":"<p>Despite the historically documented regularity in human mobility patterns, the relaxation of spatial and temporal constraints, brought by the widespread adoption of telecommuting and e-commerce during the COVID-19 pandemic, as well as a growing desire for flexible work arrangements in a post-pandemic world, indicates a potential reshaping of these patterns. In this paper, we investigate the multifaceted impacts of relaxed spatio-temporal constraints on human mobility, using well-established metrics from the travel behavior literature. Further, we introduce a novel metric for schedule regularity, accounting for specific day-of-week characteristics that previous approaches overlooked. Building on the large body of literature on the impacts of COVID-19 on human mobility, we make use of passively tracked Point of Interest (POI) data for approximately 21,700 smartphone users in the US, and analyze data between January 2020 and September 2022 to answer two key questions: (1) have the COVID-19 pandemic and its associated relaxation of spatio-temporal activity patterns reshaped the different aspects of human mobility, and (2) have we achieved a state of stable post-pandemic “new normal”? We hypothesize that the relaxation of the spatio-temporal constraints around key activities will result in people exhibiting less regular schedules. Findings reveal a complex landscape: while some mobility indicators have reverted to pre-pandemic norms, such as trip frequency and travel distance, others, notably at-home dwell-time, persist at altered levels, suggesting a recalibration rather than a return to past behaviors. 
Most notably, our analysis reveals a paradox: despite the documented large-scale shift towards flexible work arrangements, schedule habits have strengthened rather than relaxed, defying our initial hypotheses and highlighting a desire for regularity. The study’s results contribute to a deeper understanding of the post-pandemic “new normal”, offering key insights on how multiple facets of travel behavior were reshaped, if at all, by the COVID-19 pandemic, and will help inform transportation planning in a post-pandemic world.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"122 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-03-21 | DOI: 10.1140/epjds/s13688-024-00459-0
{"title":"Identification of suspicious behavior through anomalies in the tracking data of fishing vessels","authors":"","doi":"10.1140/epjds/s13688-024-00459-0","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00459-0","url":null,"abstract":"<h3>Abstract</h3> <p>Automated positioning devices can generate large datasets with information on the movement of humans, animals and objects, revealing patterns of movement, hot spots and overlaps among others. However, in the case of Automatic Identification Systems (AIS), attached to vessels, strange behaviors observed in the tracking datasets may come from intentional manipulation of the electronic devices. Thus, the analysis of anomalies can provide valuable information on suspicious behavior. Here, we analyze anomalies of fishing vessel trajectories obtained with the Automatic Identification System. The map of silence anomalies, those that occur when positioning data are absent for more than 24 hours, shows that they are most likely to occur closer to land, with 87.1% of anomalies observed within 100 km of the coast. This behavior suggests the potential of identifying silence anomalies as a proxy for illegal activities. 
With the increasing availability of high-resolution positioning of vessels and the development of powerful statistical analytical tools, we provide hints on the automatic detection of illegal activities that may help optimize the management of fishing resources.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"3 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human mobility prediction with causal and spatial-constrained multi-task network","authors":"Zongyuan Huang, Shengyuan Xu, Menghan Wang, Hansi Wu, Yanyan Xu, Yaohui Jin","doi":"10.1140/epjds/s13688-024-00460-7","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00460-7","url":null,"abstract":"<p>Modeling human mobility helps to understand how people access resources and come into physical contact with each other in cities, and thus contributes to various applications such as urban planning, epidemic control, and location-based advertisement. Next location prediction is one decisive task in individual human mobility modeling and is usually viewed as sequence modeling, solved with Markov or RNN-based methods. However, existing models pay little attention to the logic of individual travel decisions and the reproducibility of the collective behavior of the population. To this end, we propose a Causal and Spatial-constrained Long and Short-term Learner (CSLSL) for next location prediction. CSLSL utilizes a causal structure based on multi-task learning to explicitly model the “<i>when</i>→<i>what</i>→<i>where</i>”, a.k.a. “<i>time</i>→<i>activity</i>→<i>location</i>” decision logic. We next propose a spatial-constrained loss function as an auxiliary task, to ensure the consistency between the predicted and actual spatial distribution of travelers’ destinations. Moreover, CSLSL adopts modules named Long and Short-term Capturer (LSC) to learn the transition regularities across different time spans. Extensive experiments on three real-world datasets show promising performance improvements of CSLSL over baselines and confirm the effectiveness of introducing the causality and consistency constraints. 
The implementation is available at https://github.com/urbanmobility/CSLSL.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"62 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140170433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-03-12 | DOI: 10.1140/epjds/s13688-024-00455-4
{"title":"Evolving demographics: a dynamic clustering approach to analyze residential segregation in Berlin","authors":"","doi":"10.1140/epjds/s13688-024-00455-4","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00455-4","url":null,"abstract":"<h3>Abstract</h3> <p>This paper examines the phenomenon of residential segregation in Berlin over time using a dynamic clustering analysis approach. Previous research has examined the phenomenon of residential segregation in Berlin at a high spatial and temporal aggregation and statically, i.e. not over time. We propose a methodology to investigate the existence of clusters of residential areas according to migration background, age group, gender, and socio-economic dimension over time. To this end, we have developed a sequential mixed methods approach that includes a multivariate kernel density estimation technique to estimate the density of subpopulations and a dynamic cluster analysis to discover spatial patterns of residential segregation over time (2009-2020). The dynamic analysis shows the emergence of clusters on the dimensions of migration background, age group, gender and socio-economic variables. We also identified a structural change in 2015, resulting in a new cluster in Berlin that reflects the changing distribution of subpopulations with a particular migratory background. 
Finally, we discuss the findings of this study in the context of previous research and suggest possibilities for policy applications and future research using a dynamic clustering approach for analyzing changes in residential segregation at the city level.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"110 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140116828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-03-08 | DOI: 10.1140/epjds/s13688-024-00452-7
{"title":"Large-scale digital signatures of emotional response to the COVID-19 vaccination campaign","authors":"","doi":"10.1140/epjds/s13688-024-00452-7","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00452-7","url":null,"abstract":"<h3>Abstract</h3> <p>The same individuals can express very different emotions in online social media than in face-to-face interactions, partially because of intrinsic limitations of digital environments and partially because of their algorithmic design, which is optimized to maximize engagement. Such differences become even more pronounced for topics concerning socially sensitive and polarizing issues, such as massive pharmaceutical interventions. Here, we investigate how online emotional responses change during the large-scale COVID-19 vaccination campaign with respect to a baseline in which no specific contentious topic dominates. We show that the online discussions during the pandemic generate a vast spectrum of emotional responses compared to the baseline, especially when we take into account the characteristics of the users and the type of information shared in the online platform. Furthermore, we analyze the role of the political orientation of shared news, whose circulation seems to be driven not only by their actual informational content but also by the social need to strengthen one’s affiliation to, and positioning within, a specific online community by means of emotionally arousing posts. 
Our findings stress the importance of better understanding the emotional reactions to contentious topics at scale from digital signatures, while providing a more quantitative assessment of the ongoing online social dynamics to build a faithful picture of offline social implications.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"35 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140070981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}