Francesco Blefari , Cristian Cosentino , Francesco Aurelio Pironti , Angelo Furfaro , Fabrizio Marozzo
{"title":"CyberRAG: An agentic RAG cyber attack classification and reporting tool","authors":"Francesco Blefari , Cristian Cosentino , Francesco Aurelio Pironti , Angelo Furfaro , Fabrizio Marozzo","doi":"10.1016/j.future.2025.108186","DOIUrl":"10.1016/j.future.2025.108186","url":null,"abstract":"<div><div>Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94 % accuracy per class and a final classification accuracy of 94.92 % through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. 
These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108186"},"PeriodicalIF":6.2,"publicationDate":"2025-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vincenzo Maisto , Alessandro Cilardo , Emilio Billi , Chuck Fader
{"title":"A hardware/software architecture for multi-threaded offloading of erasure codes in distributed file systems","authors":"Vincenzo Maisto , Alessandro Cilardo , Emilio Billi , Chuck Fader","doi":"10.1016/j.future.2025.108187","DOIUrl":"10.1016/j.future.2025.108187","url":null,"abstract":"<div><div>Big Data analytics and cloud computing impose an ever-growing demand for data-center providers in terms of computational requirements, latency, and storage. Distributed file systems offer the strategic advantage of scaling-out computing and storage resources, hence allowing for notable speed-ups with massively parallel and distributed computing paradigms. On the other hand, such distributed clusters are constantly challenged with storage failures. Data replication is often deployed to ensure fault tolerance and business continuity, typically in a 3x configuration. This results in expensive 200 % overheads in storage space, write propagation, and energy costs. Erasure codes offer an alternative approach for fault tolerance by allowing reconstruction of erased data chunks, while reducing storage overhead down to 30 %. However, a considerable share of CPU cycles and energy is spent computing such codes, effectively reducing the cluster’s efficiency and starving other user and system tasks. Offloading on a custom accelerator is a non-trivial issue, due to the highly multi-threaded nature of such tasks and the lack of robust multi-threading support in conventional accelerator runtimes.</div><div>In this work, we present a heterogeneous hardware/software architectural design for large-scale and multi-threaded acceleration of distributed erasure codes on PCIe accelerators, and a new abstraction and integration model for distributed accelerators in fault-tolerant storage systems. 
We enable safe and seamless deployment of multi-threaded SYCL-based IP cores through a hardware thread proxying layer providing software thread-isolation, and integration with cluster-level middleware. In addition, our design allows for heterogeneous cluster configurations, with full compatibility and transparent integration of heterogeneously-accelerated and CPU-only nodes. We systematically evaluate the individual layers of our architecture and validate the design’s integration in a container-based HDFS cluster, comparing performance against the state-of-the-art AVX-512-accelerated ISA-L library and other SYCL substrates, such as GPUs and single-threaded FPGAs.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108187"},"PeriodicalIF":6.2,"publicationDate":"2025-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
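The storage figures quoted in the abstract above (200 % overhead for 3x replication, about 30 % for erasure coding) follow from simple arithmetic. The sketch below reproduces them; the function names are hypothetical, and the (10 data, 3 parity) layout is only an illustrative assumption that happens to give 30 %, not a configuration stated in the abstract.

```python
def replication_overhead(copies: int) -> float:
    """Extra storage, as a fraction of the original data, for n-way replication."""
    # Every copy beyond the first is pure overhead.
    return float(copies - 1)


def erasure_overhead(data_chunks: int, parity_chunks: int) -> float:
    """Extra storage, as a fraction of the original data, for a (k data, m parity) code."""
    # Only the parity chunks are stored in addition to the data itself.
    return parity_chunks / data_chunks


if __name__ == "__main__":
    # 3x replication: two extra copies, i.e. 200 % overhead.
    print(f"3x replication: {replication_overhead(3):.0%}")
    # Assumed (10, 3) layout: three parity chunks per ten data chunks, i.e. 30 %.
    print(f"(10, 3) erasure code: {erasure_overhead(10, 3):.0%}")
```

Under these assumptions, the erasure-coded layout tolerates the loss of any three chunks in a stripe while storing far less redundant data than replication, at the price of the encoding and decoding computation that the paper proposes to offload.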
{"title":"Performance analysis of microVMs and containers for edge computing: A focus on file and network I/O","authors":"Kyungwoon Lee , Yunha Choi , Byungchul Tak","doi":"10.1016/j.future.2025.108176","DOIUrl":"10.1016/j.future.2025.108176","url":null,"abstract":"<div><div>Container virtualization has become an indispensable foundation for edge and fog computing. Containers offer several advantages, such as support for scaling, the ability to replicate tasks, and reduced dependency. These advantages are especially crucial in the edge and fog environment. However, concerns over a relatively low level of security have followed container virtualization for some time. In response, microVM technology has emerged, offering stronger isolation and security with performance comparable to that of containers. In this work, we provide a better understanding of the feasibility of microVMs for edge and fog computing compared to containers. We conduct extensive experiments on diverse workloads to test how microVMs compare against containers in several aspects. Through rigorous measurements and analysis, we extract several important findings. Despite having a more complex architecture than containers, microVMs perform comparably to containers in terms of I/O performance. MicroVMs can even outperform containers by 58 % in certain I/O workload types. Network I/O performance of microVMs can be 2× better than that of containers. We provide our findings and insights on the performance characteristics of microVMs on edge devices. 
With a much higher degree of isolation capability, microVMs can be an attractive virtualization technique for edge computing environments.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108176"},"PeriodicalIF":6.2,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Class aware efficient client selection and multi-loss guided local performance optimization in federated learning","authors":"Akshay Singh, Rahul Thakur","doi":"10.1016/j.future.2025.108172","DOIUrl":"10.1016/j.future.2025.108172","url":null,"abstract":"<div><div>Federated Learning (FL) enables decentralized model training across multiple edge devices, preserving privacy through local data processing at the edge. While FL reduces data transfer by exchanging only model weights, inherent multiple aggregation steps can increase the communication overhead. In practice, clients’ datasets exhibit significant variation in their label distributions. This means that data across clients is not independently and identically distributed (non-IID) with respect to label classes, creating distinct patterns unique to each client. This causes inconsistencies between local objectives and global optima, which impacts the overall performance of global training. Additionally, trade-offs between performance and energy consumption remain a key challenge mostly ignored by existing FL methods. To address these issues, we introduce CACS-FL, a class distribution-aware client selection algorithm guided by a multi-loss function that optimizes both performance and energy consumption across participating edge devices. CACS-FL first calculates a confidence score for each edge device based on its heterogeneity score and energy consumption, which is further optimized to select an appropriate set of edge devices. After selection, the clients perform local training using a weighted multi-loss function, which improves personalized performance and achieves higher global performance. Experimental demonstrations on various datasets showcase CACS-FL’s advantages over existing state-of-the-art approaches. 
CACS-FL also guarantees faster convergence with fairness in performance across the participating edge devices.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108172"},"PeriodicalIF":6.2,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dongliang Cai , Borui Chen , Liang Zhang , Kexin Li , Haibin Kan
{"title":"Blockchain-enabled reliable outsourced decryption CP-ABE using responsive zkSNARK for mobile computing","authors":"Dongliang Cai , Borui Chen , Liang Zhang , Kexin Li , Haibin Kan","doi":"10.1016/j.future.2025.108182","DOIUrl":"10.1016/j.future.2025.108182","url":null,"abstract":"<div><div>Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is a promising solution for access control in mobile computing. However, the heavy decryption overhead hinders its widespread adoption. A general approach to address this issue is to outsource decryption to a decryption cloud server (DCS). Existing schemes achieve verifiability but lack an effective exemption mechanism to protect an honest DCS from false claims. In this paper, we propose a blockchain-enabled reliable outsourced decryption CP-ABE framework that achieves both verifiability and exemptibility without adding redundant information to the ciphertext. We use zkSNARK to verify outsourced results on blockchain efficiently and introduce a challenge-response mechanism to address the high cost of proof generation. Moreover, our framework ensures fair incentives and enables decentralized outsourcing through blockchain. Finally, we implement and evaluate our scheme on Ethereum to demonstrate its feasibility and efficiency. While maintaining almost the same decryption cost, our gas usage is 11× to 140× lower in the happy case and 4× to 55× lower in the challenge case than that of the scheme of Ge et al. (TDSC’24) for attribute counts from 5 to 60. 
Building upon the proposed framework, we demonstrate its application in the data sharing of electric vehicles, enabling a more extensive use of mobile computing resources.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108182"},"PeriodicalIF":6.2,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Manuel Franco De La Peña, Ángel Luis Perales Gómez, Lorenzo Fernández Maimó
{"title":"ShaTS: a Shapley-based explainability method for time series artificial intelligence models","authors":"Manuel Franco De La Peña, Ángel Luis Perales Gómez, Lorenzo Fernández Maimó","doi":"10.1016/j.future.2025.108178","DOIUrl":"10.1016/j.future.2025.108178","url":null,"abstract":"<div><div>Industrial Internet of Things environments increasingly rely on advanced Anomaly Detection and explanation techniques to rapidly detect and mitigate cyberincidents, thereby ensuring operational safety. The sequential nature of data collected from these environments has enabled improvements in Anomaly Detection using Machine Learning and Deep Learning models by processing time windows rather than treating the data as tabular. However, conventional explanation methods often neglect this temporal structure, leading to imprecise or less actionable explanations. This work presents ShaTS (Shapley values for Time Series models), which is a model-agnostic explainable Artificial Intelligence method designed to enhance the precision of Shapley value explanations for time series models. ShaTS addresses the shortcomings of traditional approaches by incorporating an a priori feature grouping strategy that preserves temporal dependencies and produces both coherent and actionable insights. 
Experiments conducted on the SWaT dataset demonstrate that ShaTS accurately identifies critical time instants, precisely pinpoints the sensors, actuators, and processes affected by anomalies, and outperforms SHAP in terms of both explainability and resource efficiency, fulfilling the real-time requirements of industrial environments.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108178"},"PeriodicalIF":6.2,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FedTVD: balancing data quality and quantity for robust federated learning","authors":"Radwan Selo, Majid Kundroo, Taehong Kim","doi":"10.1016/j.future.2025.108177","DOIUrl":"10.1016/j.future.2025.108177","url":null,"abstract":"<div><div>Federated Learning (FL) enables collaborative model training across distributed client devices while preserving data privacy. However, FL faces significant challenges due to data heterogeneity, particularly in terms of label distribution skewness and variations in dataset sizes, which can lead to biased model updates and hinder convergence. To address this, we propose FedTVD, a novel FL algorithm that weights client contributions during aggregation by considering both data quality and quantity. Unlike traditional FL approaches such as FedAvg, which rely solely on dataset size for client weighting, FedTVD integrates Total Variation Distance (TVD) to measure the divergence between each client’s local label distribution and a uniform global distribution. Clients with highly skewed distributions receive lower weights, preventing imbalanced datasets from disproportionately influencing the global model. At the same time, dataset size is incorporated to ensure scalability and fairness. This dual-weighting mechanism effectively mitigates the impact of data imbalance, leading to more stable and generalized global models. Experimental results show that FedTVD consistently outperforms state-of-the-art methods across all datasets (FMNIST, CIFAR-10, and CIFAR-100) and all levels of data heterogeneity. 
Notably, it achieves up to 10.6% improvement over FedAvg on CIFAR-10 under highly skewed data, while maintaining top performance even under moderate and IID settings.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108177"},"PeriodicalIF":6.2,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
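The dual weighting described in the FedTVD abstract above can be illustrated with a minimal sketch. The TVD computation against a uniform distribution follows the abstract; the rule used to combine it with dataset size (size multiplied by one minus TVD, then normalized) is an assumption made here for illustration only, since the abstract does not give the actual formula. All function names are hypothetical.

```python
from collections import Counter


def label_distribution(labels, num_classes):
    """Empirical label distribution of one client (labels are ints in [0, num_classes))."""
    counts = Counter(labels)
    n = len(labels)
    return [counts.get(c, 0) / n for c in range(num_classes)]


def tvd_from_uniform(dist):
    """Total Variation Distance between `dist` and the uniform distribution."""
    u = 1.0 / len(dist)
    return 0.5 * sum(abs(p - u) for p in dist)


def aggregation_weights(client_labels, num_classes):
    """Normalized client weights. ASSUMED rule: dataset size * (1 - TVD)."""
    raw = []
    for labels in client_labels:
        tvd = tvd_from_uniform(label_distribution(labels, num_classes))
        raw.append(len(labels) * (1.0 - tvd))
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    balanced = [0, 1, 0, 1]  # TVD from uniform is 0.0
    skewed = [0, 0, 0, 0]    # TVD from uniform is 0.5 with two classes
    print(aggregation_weights([balanced, skewed], num_classes=2))
```

With equal dataset sizes, plain size-based weighting would give both clients the same weight, whereas this sketch gives the skewed client half the raw weight of the balanced one.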
{"title":"Decentralized edge learning: A comparative study of distillation strategies and dissimilarity measures","authors":"Mbasa Joaquim Molo , Lucia Vadicamo , Claudio Gennaro , Emanuele Carlini","doi":"10.1016/j.future.2025.108171","DOIUrl":"10.1016/j.future.2025.108171","url":null,"abstract":"<div><div>Decentralized learning is emerging as a scalable and privacy-preserving alternative to centralized machine learning, particularly in distributed systems where data cannot be centrally shared among multiple nodes or clients. While Federated Learning is widely adopted in this context, Knowledge Distillation (KD) is emerging as a flexible and scalable alternative where model output is used to share knowledge among distributed clients. However, existing studies often overlook the efficiency and effectiveness of various knowledge transfer strategies in KD, especially in decentralized environments where data is non-IID. This study provides key insights by examining the impact of network topology and distillation strategies in KD-based decentralized learning approaches. Our evaluation spans several dissimilarity measures, including Cross-Entropy, Kullback-Leibler divergence, Triangular Divergence, Jensen-Shannon divergence, Structural Entropic Distance, and Multi-way SED, assessed under both pairwise and holistic distillation schemes. In the pairwise approach, distillation is performed by summing the client-wise dissimilarities between a client’s output and each neighbor’s prediction individually, while the holistic approach computes dissimilarity with respect to the average of the output predictions received from neighboring clients.</div><div>We also analyze performance across client connectivity levels to explore the trade-off between convergence speed and model accuracy. 
The results indicate that the holistic distillation approach, which averages client predictions, outperforms the sum of pairwise distillation, especially when employing alternative measures like TD, SED, and JS. These measures offer improved performance over conventional metrics such as CE and KL divergence.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108171"},"PeriodicalIF":6.2,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
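The difference between the pairwise and holistic schemes described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, Kullback-Leibler divergence stands in for any of the listed measures, and the argument order (neighbor output as target, own output as approximation) is an assumption.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def pairwise_loss(own, neighbors, measure=kl_divergence):
    """Sum of the dissimilarities between the client's output and each neighbor's output."""
    return sum(measure(nb, own) for nb in neighbors)


def holistic_loss(own, neighbors, measure=kl_divergence):
    """Dissimilarity between the client's output and the average of the neighbors' outputs."""
    k = len(neighbors)
    avg = [sum(nb[i] for nb in neighbors) / k for i in range(len(own))]
    return measure(avg, own)


if __name__ == "__main__":
    own = [0.5, 0.5]
    neighbors = [[0.9, 0.1], [0.1, 0.9]]
    print("pairwise:", pairwise_loss(own, neighbors))
    print("holistic:", holistic_loss(own, neighbors))
```

In this toy case the two neighbors disagree with each other but average to the client's own prediction, so the holistic loss is essentially zero while the pairwise loss stays positive. It shows how the two schemes can send different training signals from the same neighbor outputs.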
{"title":"Dynamic workload balancing in decentralized edge systems: A marginal cost approach","authors":"Emanuele Carlini , Patrizio Dazzi , Luca Ferrucci , Jacopo Massa , Matteo Mordacchini","doi":"10.1016/j.future.2025.108167","DOIUrl":"10.1016/j.future.2025.108167","url":null,"abstract":"<div><div>The rise of edge computing poses resource management issues, especially in decentralized systems where scalability and responsiveness are crucial. This paper introduces a cost-driven framework for collaborative resource management using the marginal computing cost per user. It applies the economic principle of marginal cost to assess edge data centers’ (EDCs) ability to support more users, enabling efficient resource allocation. Simulations with PureEdgeSim and real-world data such as Alibaba Trace demonstrate substantial enhancements in resource use, latency, and active instance reduction, maintaining scalability and adaptability with high user demands.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108167"},"PeriodicalIF":6.2,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huikang Huang , Weiwei Lin , Minxian Xu , Keqin Li
{"title":"AQESF: An adaptive QoS-enhanced scheduling framework for online batch of task scheduling","authors":"Huikang Huang , Weiwei Lin , Minxian Xu , Keqin Li","doi":"10.1016/j.future.2025.108174","DOIUrl":"10.1016/j.future.2025.108174","url":null,"abstract":"<div><div>For dynamic cloud environments and diverse user requirements, cloud service providers must adopt efficient scheduling methods to fulfill the quality of service (QoS). However, existing scheduling approaches are still inadequate in dealing with the online batch task scheduling problem in complex cloud environments. Specifically, existing methods do not consider the scheduling order optimization of batch tasks while taking into account long-term cumulative performance and robustness. This paper proposes an Adaptive QoS-Enhanced Scheduling Framework (AQESF) based on the multi-action Proximal Policy Optimization to address this challenge. The AQESF integrates the Deep Reinforcement Learning (DRL) Queue and the Multi-FIFO-Manner modules for joint optimization to cover the task order and task placement solution space. Furthermore, placement decisions are constrained to be solved in a more optimized space based on well-designed greedy algorithms. Extensive experimental evaluations on the Alibaba trace demonstrate that AQESF exhibits superior cumulative performance of average response time and success rate. Furthermore, AQESF exhibits strong robustness and low scheduling latency compared with the common DRL task scheduling paradigm. 
Finally, we analyze the potential applications of AQESF in VM placement and computation offloading.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108174"},"PeriodicalIF":6.2,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}