Ethereum fraud detection via joint transaction language model and graph representation learning
Jianguo Sun, Yifan Jia, Yanbin Wang, Ye Tian, Sheng Zhang
Information Fusion, Volume 120, Article 103074. Published 2025-03-13. DOI: 10.1016/j.inffus.2025.103074
Impact factor: 14.7 (JCR Q1, Computer Science, Artificial Intelligence)
https://www.sciencedirect.com/science/article/pii/S1566253525001472
Abstract
Ethereum faces growing fraud threats. Current fraud detection methods, whether they employ graph neural networks or sequence models, fail to consider the semantic information and similarity patterns within transactions. Moreover, these approaches do not exploit the potential synergy of combining both types of models. To address these challenges, we propose TLMG4Eth, which combines a transaction language model with graph-based methods to capture the semantic, similarity, and structural features of Ethereum transaction data. We first propose a transaction language model that converts numerical transaction data into meaningful transaction sentences, enabling the model to learn explicit transaction semantics. We then propose a transaction attribute similarity graph to learn transaction similarity information, which gives intuitive insight into transaction anomalies. Additionally, we construct an account interaction graph to capture the structural information of the account transaction network. We employ a deep Multi-Head Attention Network to fuse the transaction semantic and similarity embeddings, and finally propose a joint training approach for the Multi-Head Attention Network and the account interaction graph to obtain the benefits of both. Our model achieves performance improvements ranging from 9.62% to 13.2% over state-of-the-art methods on two public datasets and a newly introduced dataset. Our code is available at https://github.com/lincozz/TLmGNN.
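The abstract names two preprocessing ideas without giving details: turning numerical transaction records into "transaction sentences", and linking transactions by attribute similarity. The sketch below illustrates both in plain Python under explicit assumptions. The field names, the "name value" sentence template, the use of cosine similarity, and the top-k linking rule are all hypothetical choices made for illustration; they are not taken from the paper or from the TLmGNN repository.

```python
import math
from typing import Dict, List, Sequence, Tuple, Union

# Hypothetical transaction record; the field names are illustrative only.
Transaction = Dict[str, Union[int, float, str]]


def transaction_to_sentence(tx: Transaction) -> str:
    """Render a record as a space-separated 'name value' sentence.

    This template is an assumption; the paper's actual sentence format
    is not described in the abstract.
    """
    return " ".join(f"{name} {value}" for name, value in tx.items())


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_edges(vectors: List[Sequence[float]], k: int) -> List[Tuple[int, int]]:
    """Connect each transaction to its k most similar peers.

    Returns sorted, de-duplicated undirected edges (i, j) with i < j.
    Cosine similarity and top-k linking are assumed stand-ins for the
    paper's similarity graph construction.
    """
    edges = set()
    for i, vi in enumerate(vectors):
        scored = sorted(
            ((cosine(vi, vj), j) for j, vj in enumerate(vectors) if j != i),
            reverse=True,
        )
        for _, j in scored[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


if __name__ == "__main__":
    tx = {"amount": 1.5, "gas_price": 20, "direction": "out"}
    print(transaction_to_sentence(tx))
    print(similarity_edges([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], k=1))
```

In a pipeline like the one the abstract describes, the sentences would presumably be tokenized and fed to a language model, while the edge list would define the graph over which similarity embeddings are learned; both of those steps are outside the scope of this sketch.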
About the journal:
Information Fusion serves as a central platform for showcasing advances in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.