{"title":"Adaptive header identification and unsupervised clustering strategy for enhanced protocol reverse engineering","authors":"Mingliang Zhu, Chunxiang Gu, Xieli Zhang, Qingjun Yuan, Mengcheng Ju, Guanping Zhang, Xi Chen","doi":"10.1016/j.eswa.2025.128467","DOIUrl":"10.1016/j.eswa.2025.128467","url":null,"abstract":"<div><div>Protocol reverse engineering is critical for ensuring network security and understanding proprietary communication mechanisms. Most traditional network trace-based methods face challenges such as high computational complexity, excessive memory usage, and sensitivity to payload variations. In this paper, we propose a method that integrates adaptive message header recognition with unsupervised clustering strategies for protocol reverse engineering. Utilizing mean entropy change and change point detection algorithms, our method automatically identifies message headers, reducing the impact of payload variations on similarity measurements. Following this, our method significantly reduces computational resource consumption while maintaining clustering performance, by clustering based on a small set of selected core samples of message headers and assigning the remaining samples to existing categories. Moreover, leveraging the identified message headers, we incorporate a hierarchical format inference technique and design a function code field detector, which enhances the accuracy and efficiency of protocol reverse engineering. Our evaluation across eight widely used protocols demonstrates that our method achieves homogeneity and completeness scores of 0.94 and 0.74, respectively, in message type identification. These results significantly outperform existing protocol reverse engineering tools on the same datasets: MFD&DBSCAN (0.31, 0.73), NEMETYL (0.73, 0.64), and Netzob (0.34, 0.76). 
Furthermore, our method achieves a perfection score 1.2× higher than Binaryinferno in format inference.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"291 ","pages":"Article 128467"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144280421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph isomorphism wavelet convolutional networks for small-sample fault diagnosis of rotating machinery using multi-sensor information fusion","authors":"Lei Gao , Zhihao Liu , Qinhe Gao , Hongjie Cheng , Jianyong Yao , Xiaoli Zhao , Sixiang Jia","doi":"10.1016/j.eswa.2025.128615","DOIUrl":"10.1016/j.eswa.2025.128615","url":null,"abstract":"<div><div>Multi-sensor fault monitoring of large rotating machinery inevitably encounters the problem of limited learning samples, complicating the establishment of consistent representations between monitoring data and fault attributes. To tackle this issue, a graph isomorphism wavelet convolutional network (GIWCN) is proposed for small-sample fault diagnosis with multi-sensor data fusion. GIWCN combines the Weisfeiler-Lehman (WL) algorithm with the interpretable node feature propagation mechanism of the graph wavelet transform (GWT), which associates multi-sensor information and discriminative structural characteristics to achieve injective isomorphic feature mapping in the spectral graph wavelet domain. To exploit the consistency of fault attributes among similar samples, isomorphic graph samples are constructed with a global topological structure under the same health states. Subsequently, a graph isomorphism wavelet convolutional layer (GIWConv) is designed by embedding Multi-Layer Perceptrons (MLPs) within the GWT, thus mapping the isomorphic graphs into the same state space while ensuring the locality and sparsity of graph convolutions. Additionally, an adaptive thresholding denoising (ATD) module is integrated into the GIWConv layer to further enhance the stability of feature mapping for small samples. Finally, the isomorphic discriminative capability of GIWCN is validated on two challenging rotating machinery fault datasets, with small-sample proportions ranging from 20% to 3%. 
Compared to five state-of-the-art models, experimental results show that GIWCN achieves the highest diagnostic accuracy.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128615"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144313922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Retrieval-augmented code completion for local projects using large language models","authors":"Marko Hostnik , Marko Robnik-Šikonja","doi":"10.1016/j.eswa.2025.128596","DOIUrl":"10.1016/j.eswa.2025.128596","url":null,"abstract":"<div><div>The use of large language models (LLMs) is becoming increasingly widespread among software developers. However, privacy and computational requirements are problematic with commercial solutions and the use of LLMs. In this work, we focus on using relatively small and efficient LLMs with 160M parameters that are suitable for local execution and augmentation with retrieval from local projects. We train two open transformer-based models, the generative GPT-2 and the retrieval-adapted RETRO, on open-source Python files, and empirically compare them, confirming the benefits of embedding-based retrieval. Furthermore, we improve our models’ performance with In-context retrieval-augmented generation (RAG), which retrieves code snippets using the Jaccard similarity of tokens. We evaluate In-context RAG on larger models and determine that, despite its simplicity, the approach is more suitable than using the RETRO architecture. Experimental results indicate that In-context RAG improves the code completion baseline by over 26 %, while RETRO improves over the similarly sized GPT-2 baseline by 12 %. 
We highlight the key role of proper tokenization in achieving the full potential of LLMs in code completion.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128596"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144313973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A decision-making system for traffic management during large-scale road network construction","authors":"Huatian Gong , Qing Peng , Linwei Liu , Xiaoguang Yang","doi":"10.1016/j.eswa.2025.128527","DOIUrl":"10.1016/j.eswa.2025.128527","url":null,"abstract":"<div><div>As urban development progresses, large-scale road network construction projects are often required to upgrade key infrastructure. However, such projects pose significant traffic management challenges, including reduced network capacity and increased travel delays. To address these issues, this study proposes an end-to-end decision-making system for managing traffic during large-scale construction. The system consists of three key components: (1) a traffic state modeling module based on the user equilibrium traffic assignment model, which estimates pre-construction traffic conditions; (2) an origin-destination (OD) matrix calibration module using a bi-level optimization model and a gradient-descent-based algorithm, which aligns modeled flows with observed data to improve accuracy by 40 %; and (3) a traffic management strategy module that simulates construction-period scenarios and evaluates mitigation strategies. The system is applied to the Pinglu Canal bridge reconstruction project in Qinzhou, China. The results show that a lane-addition strategy, recommended by the system, can reduce the average peak-hour travel delay per commuter from 6.57 to 5.82 min, achieving an 11.42 % improvement. 
The proposed system serves as a practical decision-support tool for managing traffic during complex, large-scale infrastructure projects.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128527"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144330880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PACL+: Online continual learning using proxy-anchor and contrastive loss with Gaussian replay for sensor-based human activity recognition","authors":"Dhruv Aditya Mittal , Vitor Fortes Rey , Hymalai Bello , Paul Lukowicz , Sungho Suh","doi":"10.1016/j.eswa.2025.128603","DOIUrl":"10.1016/j.eswa.2025.128603","url":null,"abstract":"<div><div>The increasing prevalence of wearable sensors and devices necessitates the development of Human Activity Recognition (HAR) systems that maintain accurate performance over time. Traditional HAR models, which rely on offline supervised training, struggle to adapt to the dynamic nature of real-world environments, leading to performance degradation due to catastrophic forgetting when new activities or users are introduced. In this paper, we propose a novel continual learning method (PACL+) that integrates Proxy Anchor loss, contrastive learning, and Gaussian replay to mitigate catastrophic forgetting in HAR systems and improve HAR performance. Unlike previous approaches, PACL+ effectively handles the introduction of both new activities and new users in incremental learning steps, addressing real-world challenges such as severe subject-wise class imbalance and user-dependent learning. To improve efficiency, we introduce Gaussian replay, a memory-efficient strategy that selects representative examples for rehearsal, further stabilizing the learning process. We evaluate PACL+ on three benchmark HAR datasets under realistic continual learning scenarios with varying sampling rates and diverse class distributions. 
Experimental results demonstrate that PACL+ significantly outperforms existing state-of-the-art methods, achieving higher accuracy and F1 scores while preserving performance on previously learned activities.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128603"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144314187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Law LLM unlearning via interfere prompt, review output and update parameter: new challenges, method and baseline","authors":"Rui Shao, Yiping Tang, Lingyan Yang, Fang Wang","doi":"10.1016/j.eswa.2025.128612","DOIUrl":"10.1016/j.eswa.2025.128612","url":null,"abstract":"<div><div>Law large language models (Law LLMs) often generate hallucinations, in part because they memorize sensitive, inaccurate, or outdated information. Unlearning such information has important research value, yet the legal domain lacks publicly available datasets and effective methods for LLM unlearning tasks and evaluations. This work proposes three unlearning tasks in the legal field and provides new datasets for them. In addition, we propose a Law LLM unlearning method via loss adjustment that needs only the forgotten sequence (UNFS), providing a new baseline for the unlearning tasks. Further, for use after UNFS unlearning, we propose an inference method for Law LLMs that combines interfering input and reviewing output, reinforcing the avoidance of erroneous information in the output. We also design a new metric for Law LLM unlearning, the legal data memory evaluation method (LawME), which automatically judges the output quality of Law LLMs by comparing their output with the ground truth. Experiments and analyses on real-world datasets validate UNFS’s effectiveness: on the three proposed legal unlearning datasets, accuracy decreases by 16.53 %, perplexity increases by 3.94, and AUC decreases by 16.09 %. On the retained datasets, accuracy decreases by only 0.02 %-0.26 %, and on the general task MMLU by only 0.07 %-0.15 %. These results demonstrate that UNFS has excellent unlearning performance and does not harm performance on data that do not participate in unlearning. 
Other experiments and analyses verify the validity of the proposed inference approach and the LawME metric.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128612"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144307113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VSGNet: visual saliency guided network for skin lesion segmentation","authors":"Zhefei Cai , Yingle Fan , Tao Fang , Wei Wu","doi":"10.1016/j.eswa.2025.128635","DOIUrl":"10.1016/j.eswa.2025.128635","url":null,"abstract":"<div><div>Accurate skin lesion segmentation is of great significance for subsequent clinical diagnosis. To improve segmentation accuracy, some pioneering works embedded multiple complex modules or used large Transformer frameworks, but owing to limited computing resources, these types of large models are not suitable for actual clinical environments. To reconcile precision with a lightweight design, we propose a visual saliency guided network (VSGNet) for skin lesion segmentation, which generates saliency images of skin lesions through the efficient attention mechanism of biological vision and guides the network to quickly locate the target area, addressing the localization difficulties in skin lesion segmentation tasks. VSGNet includes three parts: a Color Constancy module, a Saliency Detection module, and an Ultra Lightweight Multi-level Interconnection Network (ULMI-Net). Specifically, ULMI-Net uses a U-shaped network as its skeleton, including the Adaptive Split Channel Attention (ASCA) module, which simulates the parallel mechanism of the dual pathways of biological vision, and the Channel-Spatial Parallel Attention (CSPA) module, inspired by the multi-level interconnection structure of the visual cortices. Through these modules, ULMI-Net balances the efficient extraction and multi-scale fusion of global and local features, aiming for excellent segmentation results at the lowest cost in parameters and computational complexity. We validate the effectiveness and robustness of the proposed VSGNet on three publicly available skin lesion segmentation datasets (ISIC2017, ISIC2018, and PH2). 
The experimental results show that, compared to other state-of-the-art methods, VSGNet improves the Dice and mIoU metrics by 1.84 % and 3.34 %, respectively, with 196 × and 106 × reductions in the number of parameters and computational complexity, respectively. This paper constructs VSGNet by integrating the biological vision mechanism with an artificial intelligence algorithm, providing a new idea for the construction of deep learning models guided by biological vision and promoting the development of biomimetic computational vision.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128635"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144330797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep-learning based optimal PMU placement and fault classification for power system","authors":"Xin Lei , Zhen Li , Huaiguang Jiang , Samson S. Yu , Yu Chen , Bin Liu , Peng Shi","doi":"10.1016/j.eswa.2025.128586","DOIUrl":"10.1016/j.eswa.2025.128586","url":null,"abstract":"<div><div>Phasor measurement units (PMUs) are vital for power grid monitoring, yet their high cost restricts widespread adoption. PMU measurement data is also crucial for fault analysis in power systems. However, existing research seldom explores the interplay between optimal PMU placement (OPP) and fault analysis, impeding advancements in grid economy and security. This study introduces a perception-driven, deep learning-based optimization approach that integrates OPP, multi-task learning, and fault data augmentation. First, deep reinforcement learning optimizes PMU placement, balancing cost-effectiveness with observability requirements. Next, multi-task learning, enhanced by Bayesian optimization, improves fault classification efficiency using PMU data. Finally, pre-trained models paired with k-means clustering augment fault data, boosting classification accuracy. Extensive simulations across four IEEE standard test systems validate the proposed method’s effectiveness.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128586"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144330881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FAME: a lightweight spatio-temporal network for model attribution of face-swap deepfakes","authors":"Wasim Ahmad , Yan-Tsung Peng , Yuan-Hao Chang","doi":"10.1016/j.eswa.2025.128571","DOIUrl":"10.1016/j.eswa.2025.128571","url":null,"abstract":"<div><div>The widespread emergence of face-swap Deepfake videos poses growing risks to digital security, privacy, and media integrity, necessitating effective forensic tools for identifying the source of such manipulations. Although most prior research has focused primarily on binary Deepfake detection, the task of model attribution, that is, determining which generative model produced a given Deepfake, remains underexplored. In this paper, we introduce <strong>FAME</strong> (Fake Attribution via Multilevel Embeddings), a lightweight and efficient spatio-temporal framework designed to capture subtle generative artifacts specific to different face-swap models. FAME integrates spatial and temporal attention mechanisms to improve attribution accuracy while remaining computationally efficient. We evaluate our model on three challenging and diverse datasets: Deepfake Detection and Manipulation (DFDM), FaceForensics++ (FF++), and FakeAVCeleb (FAVCeleb). The evaluation results show that FAME consistently performs better than existing methods in both accuracy and runtime, highlighting its potential for deployment in real-world forensic and information security applications. 
The code and pretrained models will be made publicly available at https://github.com/wasim004/FAME/.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128571"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144314216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flocking dynamics for cooperation multi-agent networks subject to intermittent communication","authors":"Xiaoyu Shi , Eber J. Ávila-Martínez , Yixiao Li , Lei Shi","doi":"10.1016/j.eswa.2025.128620","DOIUrl":"10.1016/j.eswa.2025.128620","url":null,"abstract":"<div><div>In this work, we investigate the leader-follower flocking control problem of the Cucker-Smale (C-S) model under intermittent communication. Intermittent communication is known to occur randomly in the channels between controllers, sensors, and actuators. To assess the flocking behavior of the C-S model, this research proposes a technique based on the product of sub-stochastic matrices. This method establishes sufficient conditions for flocking behavior, which depend on the agent’s weight function, topological structure, and initial state. In contrast to previous results that apply only to specific forms of positive and decreasing weight functions, our findings are more general and apply to any positive and decreasing weight function with a non-zero lower bound. In addition, analysis of the error system ensures that the speeds of all individuals eventually become consistent and converge within a convex hull with an upper bound. Finally, the validity of our results is verified through simulation examples.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"292 ","pages":"Article 128620"},"PeriodicalIF":7.5,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144314221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}