EPJ Data Science | Pub Date: 2024-03-05 | DOI: 10.1140/epjds/s13688-024-00457-2
Shijia Song, Handong Li
{"title":"Early warning signals for stock market crashes: empirical and analytical insights utilizing nonlinear methods","authors":"Shijia Song, Handong Li","doi":"10.1140/epjds/s13688-024-00457-2","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00457-2","url":null,"abstract":"<p>This study introduces a comprehensive framework grounded in recurrence analysis, a tool of nonlinear dynamics, to detect potential early warning signals (EWS) for imminent phase transitions in financial systems, with the primary goal of anticipating severe financial crashes. We first conduct a simulation experiment to demonstrate that the indicators based on multiplex recurrence networks (MRNs), namely the average mutual information and the average edge overlap, can indicate state transitions in complex systems. Subsequently, we consider the constituent stocks of China’s and the U.S. stock markets as empirical subjects, and establish MRNs based on multidimensional returns to monitor the nonlinear dynamics of the market through the corresponding indicators and topological structures. Empirical findings indicate that the primary indicators of MRNs offer valuable insights into significant financial events or periods of extreme instability. Notably, average mutual information demonstrates promise as an effective EWS for forecasting forthcoming financial crashes. An in-depth discussion and elucidation of the theoretical underpinnings for employing indicators of MRNs as EWS, the differences in indicator effectiveness, and the possible reasons for variations in the performance of the EWS across the two markets are provided. This paper contributes to the ongoing discourse on early warning of extreme market volatility, emphasizing the applicability of recurrence analysis in predicting financial crashes.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"11 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140034872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-02-28 | DOI: 10.1140/epjds/s13688-024-00453-6
Wenlong Yang, Yang Wang
{"title":"Higher-order structures of local collaboration networks are associated with individual scientific productivity","authors":"Wenlong Yang, Yang Wang","doi":"10.1140/epjds/s13688-024-00453-6","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00453-6","url":null,"abstract":"<p>The prevalence of teamwork in contemporary science has raised new questions about collaboration networks and the potential impact on research outcomes. Previous studies primarily focused on pairwise interactions between scientists when constructing collaboration networks, potentially overlooking group interactions among scientists. In this study, we introduce a higher-order network representation using algebraic topology to capture multi-agent interactions, i.e., simplicial complexes. Our main objective is to investigate the influence of higher-order structures in local collaboration networks on the productivity of the focal scientist. Leveraging a dataset comprising more than 3.7 million scientists from the Microsoft Academic Graph, we uncover several intriguing findings. Firstly, we observe an inverted U-shaped relationship between the number of disconnected components in the local collaboration network and scientific productivity. Secondly, there is a positive association between the presence of higher-order loops and individual scientific productivity, indicating the intriguing role of higher-order structures in advancing science. Thirdly, these effects hold across various scientific domains and scientists with different impacts, suggesting strong generalizability of our findings. The findings highlight the role of higher-order loops in shaping the development of individual scientists, and thus may have implications for nurturing scientific talent and promoting innovative breakthroughs.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"46 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140006988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-02-26 | DOI: 10.1140/epjds/s13688-023-00433-2
Sarah Shugars
{"title":"Critical computational social science","authors":"Sarah Shugars","doi":"10.1140/epjds/s13688-023-00433-2","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00433-2","url":null,"abstract":"<p>In her 2021 IC2S2 keynote talk, “Critical Data Theory,” Margaret Hu builds off Critical Race Theory, privacy law, and big data surveillance to grapple with questions at the intersection of big data and legal jurisprudence. As a legal scholar, Hu’s work focuses primarily on issues of governance and regulation—examining the legal and constitutional impact of modern data collection and analysis. Yet, her call for Critical Data Theory has important implications for the field of Computational Social Science (CSS) as a whole. In this article, I therefore reflect on Hu’s conception of Critical Data Theory and its broader implications for CSS research. Specifically, I’ll consider the ramifications of her work for the scientific community—exploring how we as researchers should think about the ethics and realities of the data which forms the foundations of our work.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"57 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139980944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-02-26 | DOI: 10.1140/epjds/s13688-023-00443-0
{"title":"Thinking spatially in computational social science","authors":"","doi":"10.1140/epjds/s13688-023-00443-0","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00443-0","url":null,"abstract":"<p>Deductive and theory-driven research starts by asking questions. Finding tentative answers to these questions in the literature is next. It is followed by gathering, preparing and modelling relevant data to empirically test these tentative answers. Inductive research, on the other hand, starts with data representation and finding general patterns in data. Ahn suggested, in his keynote speech at the seventh International Conference on Computational Social Science (IC<sup>2</sup>S<sup>2</sup>) 2021, that the way this data is represented could shape our understanding and the type of answers we find for the questions. He discussed that specific representation learning approaches enable a meaningful embedding space and could allow spatial thinking and broaden computational imagination. In this commentary, I summarize Ahn’s keynote and related publications, provide an overview of the use of spatial metaphor in sociology, discuss how such representation learning can help both inductive and deductive research, propose future avenues of research that could benefit from spatial thinking, and pose some still open questions.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"22 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139981035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-02-20 | DOI: 10.1140/epjds/s13688-024-00451-8
{"title":"Charting mobility patterns in the scientific knowledge landscape","authors":"","doi":"10.1140/epjds/s13688-024-00451-8","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00451-8","url":null,"abstract":"<p>From small steps to great leaps, metaphors of spatial mobility abound to describe discovery processes. Here, we ground these ideas in formal terms by systematically studying mobility patterns in the scientific knowledge landscape. We use low-dimensional embedding techniques to create a knowledge space made up of 1.5 million articles from the fields of physics, computer science, and mathematics. By analyzing the publication histories of individual researchers, we discover patterns of scientific mobility that closely resemble physical mobility. In aggregate, the trajectories form mobility flows that can be described by a gravity model, with jumps more likely to occur in areas of high density and less likely to occur over longer distances. We identify two types of researchers from their individual mobility patterns: interdisciplinary <em>explorers</em> who pioneer new fields, and <em>exploiters</em> who are more likely to stay within their specific areas of expertise. Our results suggest that spatial mobility analysis is a valuable tool for understanding the evolution of science.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"17 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139925078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-02-20 | DOI: 10.1140/epjds/s13688-023-00446-x
Luca Mungo, Silvia Bartolucci, Laura Alessandretti
{"title":"Cryptocurrency co-investment network: token returns reflect investment patterns","authors":"Luca Mungo, Silvia Bartolucci, Laura Alessandretti","doi":"10.1140/epjds/s13688-023-00446-x","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00446-x","url":null,"abstract":"<p>Since the introduction of Bitcoin in 2009, the dramatic and unsteady evolution of the cryptocurrency market has also been driven by large investments by traditional and cryptocurrency-focused hedge funds. Notwithstanding their critical role, our understanding of the relationship between institutional investments and the evolution of the cryptocurrency market has remained limited, also due to the lack of comprehensive data describing investments over time. In this study, we present a quantitative analysis of cryptocurrency institutional investments based on a dataset collected for 1324 currencies in the period between 2014 and 2022 from Crunchbase, one of the largest platforms gathering business information. We show that the evolution of the cryptocurrency market capitalization is highly correlated with the size of institutional investments, thus confirming their important role. Further, we find that the market is dominated by the presence of a group of prominent investors who tend to specialise by focusing on particular technologies. Finally, studying the co-investment network of currencies that share common investors, we show that assets with shared investors tend to be characterized by similar market behaviour. Our work sheds light on the role played by institutional investors and provides a basis for further research on their influence in the cryptocurrency ecosystem.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"23 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139925405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-01-31 | DOI: 10.1140/epjds/s13688-024-00449-2
Manjin Shao, Hong Fan
{"title":"Identifying the systemic importance and systemic vulnerability of financial institutions based on portfolio similarity correlation network","authors":"Manjin Shao, Hong Fan","doi":"10.1140/epjds/s13688-024-00449-2","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00449-2","url":null,"abstract":"<p>The indirect correlation among financial institutions, stemming from similarities in their portfolios, is a primary driver of systemic risk. However, most existing research overlooks the influence of portfolio similarity among various types of financial institutions on this risk. Therefore, we construct the network of portfolio similarity correlations among different types of financial institutions, based on measurements of portfolio similarity. Utilizing the expanded fire sale contagion model, we offer a comprehensive assessment of systemic risk for Chinese financial institutions. Initially, we introduce indicators for systemic risk, systemic importance, and systemic vulnerability. Subsequently, we examine the cross-sectional and time-series characteristics of these institutions’ systemic importance and vulnerability within the context of the portfolio similarity correlation network. Our empirical findings reveal a high degree of portfolio similarity between banks and insurance companies, contrasted with lower similarity between banks and securities firms. Moreover, when considering the portfolio similarity correlation network, both the systemic importance and vulnerability of Chinese banks and insurance companies surpass those of securities firms in both cross-sectional and temporal dimensions. Notably, our analysis further illustrates that a financial institution’s systemic importance and vulnerability are strongly and positively associated with the magnitude of portfolio similarity between that institution and others.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"2 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139645070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-01-31 | DOI: 10.1140/epjds/s13688-024-00450-9
Bao Tran Truong, Oliver Melbourne Allen, Filippo Menczer
{"title":"Account credibility inference based on news-sharing networks","authors":"Bao Tran Truong, Oliver Melbourne Allen, Filippo Menczer","doi":"10.1140/epjds/s13688-024-00450-9","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00450-9","url":null,"abstract":"<p>The spread of misinformation poses a threat to the social media ecosystem. Effective countermeasures to mitigate this threat require that social media platforms be able to accurately detect low-credibility accounts even before the content they share can be classified as misinformation. Here we present methods to infer account credibility from information diffusion patterns, in particular leveraging two networks: the reshare network, capturing an account’s trust in other accounts, and the bipartite account-source network, capturing an account’s trust in media sources. We extend network centrality measures and graph embedding techniques, systematically comparing these algorithms on data from diverse contexts and social media platforms. We demonstrate that both kinds of trust networks provide useful signals for estimating account credibility. Some of the proposed methods yield high accuracy, providing promising solutions to promote the dissemination of reliable information in online communities. Two kinds of homophily emerge from our results: accounts tend to have similar credibility if they reshare each other’s content or share content from similar sources. Our methodology invites further investigation into the relationship between accounts and news sources to better characterize misinformation spreaders.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"23 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139645231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-01-29 | DOI: 10.1140/epjds/s13688-024-00448-3
Michele Coscia
{"title":"Which sport is becoming more predictable? A cross-discipline analysis of predictability in team sports","authors":"Michele Coscia","doi":"10.1140/epjds/s13688-024-00448-3","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00448-3","url":null,"abstract":"<p>Professional sports are a cultural activity beloved by many, and a global hundred-billion-dollar industry. In this paper, we investigate the trends of match outcome predictability, assuming that the public is more interested in an event if there is some uncertainty about who will win. We reproduce previous methodology focused on soccer and we expand it by analyzing more than 300,000 matches in the 1996-2023 period from nine disciplines, to identify which disciplines are getting more/less predictable over time. We investigate the home advantage effect, since it can affect outcome predictability and it has been impacted by the COVID-19 pandemic. Going beyond previous work, we estimate which sport management model – between the egalitarian one popular in North America and the rich-get-richer used in Europe – leads to more uncertain outcomes. Our results show that there is no generalized trend in predictability across sport disciplines, that home advantage has been decreasing independently from the pandemic, and that sports managed with the egalitarian North American approach tend to be less predictable. We base our results on a predictive model that ranks teams by analyzing the directed network of who-beats-whom, where the most central teams in the network are expected to be the best performing ones. Our results are robust to the measure we use for the prediction.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"43 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139587346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EPJ Data Science | Pub Date: 2024-01-19 | DOI: 10.1140/epjds/s13688-023-00442-1
Francesco Carli, Pietro Foini, Nicolò Gozzi, Nicola Perra, Rossano Schifanella
{"title":"Modeling teams performance using deep representational learning on graphs","authors":"Francesco Carli, Pietro Foini, Nicolò Gozzi, Nicola Perra, Rossano Schifanella","doi":"10.1140/epjds/s13688-023-00442-1","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00442-1","url":null,"abstract":"<p>Most human activities require collaborations within and across formal or informal teams. Our understanding of how the collaborative efforts spent by teams relate to their performance is still a matter of debate. Teamwork results in a highly interconnected ecosystem of potentially overlapping components where tasks are performed in interaction with team members and across other teams. To tackle this problem, we propose a graph neural network model to predict a team’s performance while identifying the drivers determining such outcome. In particular, the model is based on three architectural channels: topological, centrality, and contextual, which capture different factors potentially shaping teams’ success. We endow the model with two attention mechanisms to boost model performance and allow interpretability. A first mechanism allows pinpointing key members inside the team. A second mechanism allows us to quantify the contributions of the three driver effects in determining the outcome performance. We test model performance on various domains, outperforming most classical and neural baselines. Moreover, we include synthetic datasets designed to validate how the model disentangles the intended properties, on which our model vastly outperforms baselines.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"29 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139508999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}