{"title":"LATEX-GCL: Large Language Models (LLMs)-Based Data Augmentation for Text-Attributed Graph Contrastive Learning","authors":"Haoran Yang, Xiangyu Zhao, Sirui Huang, Qing Li, Guandong Xu","doi":"arxiv-2409.01145","DOIUrl":"https://doi.org/arxiv-2409.01145","url":null,"abstract":"Graph Contrastive Learning (GCL) is a potent paradigm for self-supervised graph learning that has attracted attention across various application scenarios. However, GCL for learning on Text-Attributed Graphs (TAGs) has yet to be explored. Because conventional augmentation techniques like feature embedding masking cannot directly process textual attributes on TAGs. A naive strategy for applying GCL to TAGs is to encode the textual attributes into feature embeddings via a language model and then feed the embeddings into the following GCL module for processing. Such a strategy faces three key challenges: I) failure to avoid information loss, II) semantic loss during the text encoding phase, and III) implicit augmentation constraints that lead to uncontrollable and incomprehensible results. In this paper, we propose a novel GCL framework named LATEX-GCL to utilize Large Language Models (LLMs) to produce textual augmentations and LLMs' powerful natural language processing (NLP) abilities to address the three limitations aforementioned to pave the way for applying GCL to TAG tasks. Extensive experiments on four high-quality TAG datasets illustrate the superiority of the proposed LATEX-GCL method. The source codes and datasets are released to ease the reproducibility, which can be accessed via this link: https://anonymous.4open.science/r/LATEX-GCL-0712.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When Heterophily Meets Heterogeneous Graphs: Latent Graphs Guided Unsupervised Representation Learning","authors":"Zhixiang Shen, Zhao Kang","doi":"arxiv-2409.00687","DOIUrl":"https://doi.org/arxiv-2409.00687","url":null,"abstract":"Unsupervised heterogeneous graph representation learning (UHGRL) has gained increasing attention due to its significance in handling practical graphs without labels. However, heterophily has been largely ignored, despite its ubiquitous presence in real-world heterogeneous graphs. In this paper, we define semantic heterophily and propose an innovative framework called Latent Graphs Guided Unsupervised Representation Learning (LatGRL) to handle this problem. First, we develop a similarity mining method that couples global structures and attributes, enabling the construction of fine-grained homophilic and heterophilic latent graphs to guide the representation learning. Moreover, we propose an adaptive dual-frequency semantic fusion mechanism to address the problem of node-level semantic heterophily. To cope with the massive scale of real-world data, we further design a scalable implementation. Extensive experiments on benchmark datasets validate the effectiveness and efficiency of our proposed framework. The source code and datasets have been made available at https://github.com/zxlearningdeep/LatGRL.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Anti-Money Laundering Efforts with Network-Based Algorithms","authors":"Anthony Bonato, Juan Sebastian Chavez Palan, Adam Szava","doi":"arxiv-2409.00823","DOIUrl":"https://doi.org/arxiv-2409.00823","url":null,"abstract":"The global banking system has faced increasing challenges in combating money laundering, necessitating advanced methods for detecting suspicious transactions. Anti-money laundering (or AML) approaches have often relied on predefined thresholds and machine learning algorithms using flagged transaction data, which are limited by the availability and accuracy of existing datasets. In this paper, we introduce a novel algorithm that leverages network analysis to detect potential money laundering activities within large-scale transaction data. Utilizing an anonymized transactional dataset from Coöperatieve Rabobank U.A., our method combines community detection via the Louvain algorithm and small cycle detection to identify suspicious transaction patterns below the regulatory reporting thresholds. Our approach successfully identifies cycles of transactions that may indicate layering steps in money laundering, providing a valuable tool for financial institutions to enhance their AML efforts. The results suggest the efficacy of our algorithm in pinpointing potentially illicit activities that evade current detection methods.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Faster Graph Partitioning via Pre-training and Inductive Inference","authors":"Meng Qin, Chaorui Zhang, Yu Gao, Yibin Ding, Weipeng Jiang, Weixi Zhang, Wei Han, Bo Bai","doi":"arxiv-2409.00670","DOIUrl":"https://doi.org/arxiv-2409.00670","url":null,"abstract":"Graph partitioning (GP) is a classic problem that divides the node set of a graph into densely-connected blocks. Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large-language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. We first conduct the offline pre-training of a deep graph learning (DGL) model on small synthetic graphs with various topology properties. By using the inductive inference of DGL, one can directly generalize the pre-trained model (with frozen model parameters) to large graphs and derive feasible GP results. We also use the derived partition as a good initialization of an efficient GP method (e.g., InfoMap) to further refine the quality of partitioning. In this setting, the online generalization and refinement of PR-GPT can not only benefit from the transfer ability regarding quality but also ensure high inference efficiency without re-training. Based on a mechanism of reducing the scale of a graph to be processed by the refinement method, PR-GPT also has the potential to support streaming GP. Experiments on the Graph Challenge benchmark demonstrate that PR-GPT can ensure faster GP on large-scale graphs without significant quality degradation, compared with running a refinement method from scratch. We will make our code public at https://github.com/KuroginQin/PRGPT.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GSpect: Spectral Filtering for Cross-Scale Graph Classification","authors":"Xiaoyu Zhang, Wenchuan Yang, Jiawei Feng, Bitao Dai, Tianci Bu, Xin Lu","doi":"arxiv-2409.00338","DOIUrl":"https://doi.org/arxiv-2409.00338","url":null,"abstract":"Identifying structures in common forms the basis for networked systems design and optimization. However, real structures represented by graphs are often of varying sizes, leading to the low accuracy of traditional graph classification methods. These graphs are called cross-scale graphs. To overcome this limitation, in this study, we propose GSpect, an advanced spectral graph filtering model for cross-scale graph classification tasks. Compared with other methods, we use graph wavelet neural networks for the convolution layer of the model, which aggregates multi-scale messages to generate graph representations. We design a spectral-pooling layer which aggregates nodes to one node to reduce the cross-scale graphs to the same size. We collect and construct the cross-scale benchmark data set, MSG (Multi Scale Graphs). Experiments reveal that, on open data sets, GSpect improves the performance of classification accuracy by 1.62% on average, and for a maximum of 3.33% on PROTEINS. On MSG, GSpect improves the performance of classification accuracy by 15.55% on average. GSpect fills the gap in cross-scale graph classification studies and has potential to provide assistance in application research like diagnosis of brain disease by predicting the brain network's label and developing new drugs with molecular structures learned from their counterparts in other systems.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Social MediARverse Investigating Users Social Media Content Sharing and Consuming Intentions with Location-Based AR","authors":"Linda Hirsch, Florian Müller, Mari Kruse, Andreas Butz, Robin Welsch","doi":"arxiv-2409.00211","DOIUrl":"https://doi.org/arxiv-2409.00211","url":null,"abstract":"Augmented Reality (AR) is evolving to become the next frontier in social media, merging physical and virtual reality into a living metaverse, a Social MediARverse. With this transition, we must understand how different contexts (public, semi-public, and private) affect user engagement with AR content. We address this gap in current research by conducting an online survey with 110 participants, showcasing 36 AR videos, and polling them about the content's fit and appropriateness. Specifically, we manipulated these three spaces, two forms of dynamism (dynamic vs. static), and two dimensionalities (2D vs. 3D). Our findings reveal that dynamic AR content is generally more favorably received than static content. Additionally, users find sharing and engaging with AR content in private settings more comfortable than in others. By this, the study offers valuable insights for designing and implementing future Social MediARverses and guides industry and academia on content visualization and contextual considerations.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Many Lines to Paint the City: Exact Edge-Cover in Temporal Graphs","authors":"Argyrios Deligkas, Michelle Döring, Eduard Eiben, Tiger-Lily Goldsmith, George Skretas, Georg Tennigkeit","doi":"arxiv-2408.17107","DOIUrl":"https://doi.org/arxiv-2408.17107","url":null,"abstract":"Logistics and transportation networks require a large amount of resources to realize necessary connections between locations and minimizing these resources is a vital aspect of planning research. Since such networks have dynamic connections that are only available at specific times, intricate models are needed to portray them accurately. In this paper, we study the problem of minimizing the number of resources needed to realize a dynamic network, using the temporal graphs model. In a temporal graph, edges appear at specific points in time. Given a temporal graph and a natural number k, we ask whether we can cover every temporal edge exactly once using at most k temporal journeys; in a temporal journey consecutive edges have to adhere to the order of time. We conduct a thorough investigation of the complexity of the problem with respect to four dimensions: (a) whether the type of the temporal journey is a walk, a trail, or a path; (b) whether the chronological order of edges in the journey is strict or non-strict; (c) whether the temporal graph is directed or undirected; (d) whether the start and end points of each journey are given or not. We almost completely resolve the complexity of all these problems and provide dichotomies for each one of them with respect to k.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LLMs hallucinate graphs too: a structural perspective","authors":"Erwan Le Merrer, Gilles Tredan","doi":"arxiv-2409.00159","DOIUrl":"https://doi.org/arxiv-2409.00159","url":null,"abstract":"It is known that LLMs do hallucinate, that is, they return incorrect information as facts. In this paper, we introduce the possibility to study these hallucinations under a structured form: graphs. Hallucinations in this context are incorrect outputs when prompted for well known graphs from the literature (e.g. Karate club, Les Misérables, graph atlas). These hallucinated graphs have the advantage of being much richer than the factual accuracy -- or not -- of a fact; this paper thus argues that such rich hallucinations can be used to characterize the outputs of LLMs. Our first contribution observes the diversity of topological hallucinations from major modern LLMs. Our second contribution is the proposal of a metric for the amplitude of such hallucinations: the Graph Atlas Distance, that is the average graph edit distance from several graphs in the graph atlas set. We compare this metric to the Hallucination Leaderboard, a hallucination rank that leverages 10,000 times more prompts to obtain its ranking.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Service-Oriented AoI Modeling and Analysis for Non-Terrestrial Networks","authors":"Zheng Guo, Qian Chen, Weixiao Meng","doi":"arxiv-2408.17051","DOIUrl":"https://doi.org/arxiv-2408.17051","url":null,"abstract":"To achieve truly seamless global intelligent connectivity, non-terrestrial networks (NTN) mainly composed of low earth orbit (LEO) satellites and drones are recognized as important components of the future 6G network architecture. Meanwhile, the rapid advancement of the Internet of Things (IoT) has led to the proliferation of numerous applications with stringent requirements for timely information delivery. The Age of Information (AoI), a critical performance metric for assessing the freshness of data in information update systems, has gained significant importance in this context. However, existing modeling and analysis work on AoI mainly focuses on terrestrial networks, and the distribution characteristics of ground nodes and the high dynamics of satellites have not been fully considered, which poses challenges for more accurate evaluation. Against this background, we model the ground nodes as a hybrid distribution of Poisson point process (PPP) and Poisson cluster process (PCP) to capture the impact of ground node distribution on the AoI of status update packet transmission supported by UAVs and satellites in NTN, and the visibility and cross-traffic characteristics of satellites are additionally considered. We derived the average AoI for the system in these two different situations and examined the impact of various network parameters on AoI performance.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Longitudinal Modularity, a Modularity for Link Streams","authors":"Victor Brabant, Yasaman Asgari, Pierre Borgnat, Angela Bonifati, Remy Cazabet","doi":"arxiv-2408.16877","DOIUrl":"https://doi.org/arxiv-2408.16877","url":null,"abstract":"Temporal networks are commonly used to model real-life phenomena. When these phenomena represent interactions and are captured at a fine-grained temporal resolution, they are modeled as link streams. Community detection is an essential network analysis task. Although many methods exist for static networks, and some methods have been developed for temporal networks represented as sequences of snapshots, few works can handle link streams. This article introduces the first adaptation of the well-known Modularity quality function to link streams. Unlike existing methods, it is independent of the time scale of analysis. After introducing the quality function, and its relation to existing static and dynamic definitions of Modularity, we show experimentally its relevance for dynamic community evaluation.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142214920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}