{"title":"Unmanned aerial vehicle swarm-assisted reliable federated learning for traffic flow prediction","authors":"Man Zhou, Lansheng Han, Yangyang Geng","doi":"10.1016/j.future.2025.107828","DOIUrl":"10.1016/j.future.2025.107828","url":null,"abstract":"<div><div>Unmanned Aerial Vehicle (UAV) swarms, as efficient and flexible monitoring tools, can collect real-time traffic information over extensive areas. However, UAV swarms engaged in traffic monitoring are vulnerable to network attacks and privacy breaches, leading to data distortion and compromised system performance. To address these security challenges and incentivize UAV participation, we propose CI-AGFL, a federated learning (FL)-based swarm intelligence approach that enables distributed traffic flow prediction through seamless information sharing and fusion between ground vehicles and UAV swarms. In CI-AGFL, ground vehicles train local models, which are then aggregated into a global model by UAV swarms using a robust, decentralized aggregation method grounded in consensus confirmation. Furthermore, a fuzzy membership method is employed to evaluate UAV trustworthiness during the model aggregation phase. Additionally, we introduce a reputation-based multi-dimensional contract theory incentive mechanism to optimize UAV participation in federated learning tasks, dynamically balancing energy consumption with training latency to ensure accurate, real-time traffic flow predictions. Experimental results demonstrate that CI-AGFL outperforms three advanced traffic flow prediction methods, achieving improvements of 8.2% to 22.8% in MAE, MSE, RMSE, and MAPE metrics, while significantly enhancing model convergence.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170","pages":"Article 107828"},"PeriodicalIF":6.2,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
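The abstract above mentions a fuzzy membership method for scoring UAV trustworthiness during aggregation. The paper's actual membership design is not reproduced here; the following is a minimal sketch of one common choice, a triangular membership function over a reputation score, with all parameter values (`lo`, `peak`, `hi`) assumed for illustration.

```python
def triangular_trust(score, lo=0.2, peak=0.6, hi=1.0):
    """Triangular fuzzy membership over a reputation score in [0, 1].

    Membership rises linearly from `lo` to `peak`, then falls toward
    `hi`. This is an illustrative stand-in, not the CI-AGFL design;
    all breakpoint values are hypothetical.
    """
    if score <= lo or score >= hi:
        return 0.0
    if score <= peak:
        return (score - lo) / (peak - lo)
    return (hi - score) / (hi - peak)
```

Such a score could weight each UAV's contribution during decentralized aggregation, down-weighting swarm members with low reputations.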
{"title":"Performance portability of sparse matrix–vector multiplication implemented using OpenMP, OpenACC and SYCL","authors":"Kinga Stec, Przemysław Stpiczyński","doi":"10.1016/j.future.2025.107825","DOIUrl":"10.1016/j.future.2025.107825","url":null,"abstract":"<div><div>The aim of this paper is to study the performance portability of OpenMP, OpenACC and SYCL implementations of the sparse matrix–vector product (SpMV) and its extended version, in which the dot product of the input vector and the result is also calculated, for the CSR and BSR storage formats on Intel and AMD CPUs and NVIDIA GPU platforms. We compare it with the performance portability of the much more sophisticated implementations provided by the vendors in their Intel oneAPI MKL and NVIDIA cuSPARSE libraries. Using a reformulated performance portability metric, we show how it changes for various sparse matrices and which portable implementation and format achieve better performance portability. Numerical experiments show that the considered portable implementations usually achieve better performance for the CSR format than for the BSR format. On the GPU, the CSR OpenACC implementations of SpMV and SpMV-DOT tend to be the best. On the CPU, the CSR OpenMP implementation usually gives the best results for SpMV-DOT, while CSR OpenMP and BSR MKL achieve the best results for a similar number of matrices.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170","pages":"Article 107825"},"PeriodicalIF":6.2,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143739224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
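The extended SpMV-DOT kernel studied above fuses the dot product of the input vector and the result into the same sweep over the CSR structure. A plain-Python sketch of that kernel (the paper's implementations are OpenMP/OpenACC/SYCL; this serial version only illustrates the access pattern):

```python
def csr_spmv_dot(row_ptr, col_idx, vals, x):
    # Computes y = A @ x for a square matrix A stored in CSR format,
    # fusing dot(x, y) into the same row sweep (the SpMV-DOT kernel).
    n = len(row_ptr) - 1
    y = [0.0] * n
    dot = 0.0
    for i in range(n):
        acc = 0.0
        # Nonzeros of row i live in vals[row_ptr[i]:row_ptr[i+1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
        dot += x[i] * acc  # fused reduction, saving a second pass
    return y, dot
```

Fusing the reduction avoids re-reading `x` and `y` from memory, which is why the extended kernel is interesting for bandwidth-bound SpMV.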
{"title":"A comparative study of ad-hoc file systems for extreme scale computing","authors":"Njoud O. Al-Maaitah, Javier Garcia-Blas, Genaro Sanchez-Gallegos, Jesus Carretero, Marc-André Vef, André Brinkmann","doi":"10.1016/j.future.2025.107815","DOIUrl":"10.1016/j.future.2025.107815","url":null,"abstract":"<div><div>High-performance computing (HPC) systems often suffer from interference caused by multiple applications accessing a shared parallel file system, which can negatively impact compute performance. One solution to this problem is to add new tiers to the HPC storage hierarchy that can absorb I/O bursts and support moving data between tiers based on its hotness. Ad-hoc file systems serve as an intermediate storage layer that leverages new storage technologies, such as non-volatile random access memory devices and flash-based solid state drives, to provide temporary storage based on application behavior in the HPC environment. A variety of ad-hoc file systems have been proposed recently. In this survey, we explore the integration of fast storage layers into HPC storage hierarchies and examine various ad-hoc file systems, highlighting their features and functionalities in order to categorize the proposed solutions into different groups.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170","pages":"Article 107815"},"PeriodicalIF":6.2,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143734798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dimensioning network slices for power minimization under reliability constraints","authors":"Wei Huang, Andrea Araldo, Hind Castel-Taleb, Badii Jouaber","doi":"10.1016/j.future.2025.107824","DOIUrl":"10.1016/j.future.2025.107824","url":null,"abstract":"<div><div>Network slicing allows multiplexing virtualized networks, called <em>slices</em>, over a single physical network infrastructure. Research has extensively focused on the placement of the virtual functions and links that compose each network slice. On the other hand, performance greatly depends on how many resources are allocated to virtual nodes and links <em>after</em> they are placed. This aspect has been mostly neglected.</div><div>In this paper, we propose a method to dimension the computation and network resources allocated to slices, with the aim of minimizing dynamic power consumption. Latency and power are the result of non-trivial couplings between the different components of each slice. Therefore, minimizing power while satisfying the reliability constraints of all slices is challenging. To capture these couplings, we model slices as multiple Jackson networks (one per slice) co-existing in the same resource-constrained physical network. To the best of our knowledge, we are the first to employ Jackson networks in such a setting. Dynamic power savings are in large part obtained by finely tuning CPU clock frequency, exploiting Dynamic Voltage Frequency Scaling (DVFS). Via numerical evaluation, we show that our method finds, for each slice, just the right amount of resources to satisfy latency constraints (expressed in probabilistic terms, as chance constraints). This brings a relevant dynamic power reduction with respect to state-of-the-art baselines in network slicing, which focus on placement without specific strategies for resource dimensioning.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170","pages":"Article 107824"},"PeriodicalIF":6.2,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143734889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
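In a Jackson network, each virtual node behaves as an M/M/1 queue, whose sojourn time is exponentially distributed with rate (mu - lambda). That makes the chance-constrained dimensioning mentioned above tractable for a single queue: the smallest service rate satisfying P(T > deadline) <= eps has a closed form. The sketch below shows only this single-queue building block, not the paper's full multi-slice optimization.

```python
import math

def min_service_rate(arrival_rate, deadline, eps):
    # For an M/M/1 queue, the sojourn time T is exponential with rate
    # (mu - lambda), so P(T > deadline) = exp(-(mu - lambda) * deadline).
    # Requiring this to be <= eps (a chance constraint) gives the
    # smallest feasible service rate:
    #   mu >= lambda - ln(eps) / deadline
    return arrival_rate - math.log(eps) / deadline
```

Any capacity above this rate satisfies the latency constraint but wastes dynamic power, which is why dimensioning "just right" (e.g. via DVFS) saves energy.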
{"title":"Towards sustainable smart cities: Workflow scheduling in cloud of health things (CoHT) using deep reinforcement learning and moth flame optimization for edge–cloud systems","authors":"Mustafa Ibrahim Khaleel","doi":"10.1016/j.future.2025.107821","DOIUrl":"10.1016/j.future.2025.107821","url":null,"abstract":"<div><div>In smart cities, the Cloud of Health Things (CoHT) enhances service delivery and optimizes task scheduling and allocation. As CoHT systems proliferate and offer a range of services with varying Quality of Service (QoS) demands, servers face the challenge of efficiently distributing limited virtual machines across internet-based applications. This can strain performance, particularly for latency-sensitive healthcare applications, resulting in increased delays. Edge computing mitigates this issue by bringing computational, storage, and network resources closer to the data source, working in tandem with cloud computing. Combining edge and cloud computing is essential for improving efficiency, especially for IoT-driven tasks where reliability and low latency are vital concerns. This paper introduces an intelligent task scheduling and allocation model that leverages the Moth Flame Optimization (MFO) algorithm, integrated with deep reinforcement learning (DRL), to optimize edge–cloud computing in sustainable smart cities. The model utilizes a bi-class neural network to classify tasks, ensuring rapid convergence while delivering both local and globally optimal solutions, achieving efficient resource allocation, and enhancing QoS. The model was trained on real-world and synthesized cluster datasets, including the Google cluster dataset, to learn cloud-based job scheduling, which is then applied in real-time. Compared with DRL and non-DRL approaches, the model shows significant performance gains, with a 76.2% reduction in latency, an 81.9% increase in reliability, a 74.4% improvement in resource utilization, and an 83.1% enhancement in QoS.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170","pages":"Article 107821"},"PeriodicalIF":6.2,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143739223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MinCache: A hybrid cache system for efficient chatbots with hierarchical embedding matching and LLM","authors":"Keihan Haqiq, Majid Vafaei Jahan, Saeede Anbaee Farimani, Seyed Mahmood Fattahi Masoom","doi":"10.1016/j.future.2025.107822","DOIUrl":"10.1016/j.future.2025.107822","url":null,"abstract":"<div><div>Large Language Models (LLMs) have emerged as powerful tools for various natural language processing tasks such as multi-agent chatbots, but their computational complexity and resource requirements pose significant challenges for real-time chatbot applications. Caching strategies can alleviate these challenges by reducing redundant computations and improving response times. In this paper, we propose MinCache, a novel hybrid caching system tailored for LLM applications. Our system employs a hierarchical cache strategy for string retrieval, performing exact match lookups first, followed by resemblance matching, and finally resorting to semantic matching to deliver the most relevant information. MinCache combines the strengths of Least Recently Used (LRU) caching and string fingerprinting techniques, leveraging the MinHash algorithm for fast <em>resemblance</em> matching. Additionally, MinCache leverages a sentence-transformer to estimate the <em>semantics</em> of input prompts. By integrating these approaches, MinCache delivers high cache hit rates, faster response delivery, and improved scalability for LLM applications across diverse domains. Our experiments demonstrate a significant acceleration of LLM applications, by up to <span>4.5X</span> against GPTCache, as well as improvements in cache hit accuracy. We also discuss the scalability of our proposed approach across medical domain chat services.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170","pages":"Article 107822"},"PeriodicalIF":6.2,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143714393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
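The resemblance tier described above rests on MinHash: two prompts whose shingle sets overlap heavily will agree on most slots of their signatures. A minimal pure-Python sketch (hash choice, shingle size, and signature length are illustrative, not MinCache's actual configuration):

```python
import zlib

def minhash_signature(text, num_hashes=64, shingle=3):
    # Character-shingle MinHash: slot i keeps the minimum of a
    # seed-salted CRC32 over all shingles of the text.
    shingles = {text[i:i + shingle] for i in range(len(text) - shingle + 1)}
    return [min(zlib.crc32((str(seed) + s).encode()) for s in shingles)
            for seed in range(num_hashes)]

def resemblance(sig_a, sig_b):
    # The fraction of agreeing slots is an unbiased estimate of the
    # Jaccard similarity of the two shingle sets.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

A cache lookup would compare the incoming prompt's signature against stored fingerprints and fall through to semantic (embedding) matching only when resemblance is below a threshold.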
{"title":"Dynamic class-balanced threshold Federated Semi-Supervised Learning by exploring diffusion model and all unlabeled data","authors":"Zeyuan Wang, Yang Liu, Guirong Liang, Cheng Zhong, Feng Yang","doi":"10.1016/j.future.2025.107820","DOIUrl":"10.1016/j.future.2025.107820","url":null,"abstract":"<div><div>Federated Semi-Supervised Learning (FSSL) aims to train models based on federated learning using a small amount of labeled data and a large amount of unlabeled data. The limited labeled data and the issue of non-independent and identically distributed (non-IID) data are the major challenges faced by FSSL. Most previous methods use traditional fixed thresholds to filter out high-confidence samples and assign pseudo-labels to them, without considering low-confidence samples. These methods then increase the sample space through random sampling and other techniques to address the challenges of FSSL. However, the performance of these models remains unsatisfactory. To tackle these challenges, we propose DDRFed, a novel FSSL framework that effectively utilizes all available data by integrating a diffusion model and dynamic class-balanced thresholds. Specifically, we first mitigate the client-side non-IID issue by utilizing a dataset generated by a client-side co-trained diffusion model that conforms to the global data distribution. The local clients then use the global class distribution information provided by the server to establish dynamic class-balanced thresholds, which distinguish between high-confidence and low-confidence samples. These dynamic thresholds ensure a sufficient amount of pseudo-labeled data throughout the training process. Meanwhile, to fully leverage the knowledge contained in low-confidence samples, we optimize the model's performance through residual class negative learning. Experiments conducted on two natural datasets demonstrate the superiority of DDRFed in addressing both major challenges of FSSL.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170","pages":"Article 107820"},"PeriodicalIF":6.2,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143704950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
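One common way to realize class-balanced thresholds like those described above is to scale a base confidence threshold by each class's share of the global distribution, so rare classes are not starved of pseudo-labels. The exact DDRFed rule is not given in the abstract; this is a hypothetical sketch with assumed `base` and `floor` values.

```python
def class_balanced_thresholds(global_class_freq, base=0.95, floor=0.5):
    # Hypothetical dynamic thresholding: the server-provided global
    # class frequencies scale a base pseudo-label threshold, so rarer
    # classes get lower thresholds (clipped at `floor`) and still
    # contribute confident samples. Not the paper's exact rule.
    m = max(global_class_freq)
    return [max(floor, base * f / m) for f in global_class_freq]
```

Samples whose confidence falls below their class threshold would then be routed to the low-confidence branch (e.g. negative learning) rather than discarded.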
{"title":"A distributed identity management and cross-domain authentication scheme for the Internet of Things","authors":"Miaomiao Wang, Ze Wang","doi":"10.1016/j.future.2025.107818","DOIUrl":"10.1016/j.future.2025.107818","url":null,"abstract":"<div><div>Reliable identity management and authentication are prerequisites for secure information communication. Traditional centralized schemes rely on a Certificate Authority (CA), and their cross-domain authentication is complex, posing a risk of centralized data leakage. The advancement of blockchain technology has disrupted the traditional model, leading to the emergence of Self-Sovereign Identity (SSI) management and authentication schemes. However, the widespread adoption of SSI still faces challenges, such as key loss and the inefficiency of Merkle-tree verification. Therefore, we propose an improved distributed identity management and cross-domain authentication scheme for the Internet of Things (IoT). In this scheme, a key creation and recovery mechanism is first proposed to prevent identity unavailability caused by key loss. Then, a double one-way accumulator algorithm is designed to improve identity authentication and enhance its efficiency. Our scheme passes both formal and informal security analyses and demonstrates robust performance.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"169","pages":"Article 107818"},"PeriodicalIF":6.2,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143681465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
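The abstract above contrasts accumulator-based verification with Merkle-tree proofs. The paper's *double* one-way accumulator is not specified here; the toy below shows only the underlying idea with a single RSA-style accumulator, whose witnesses are constant-size regardless of set size (modulus and elements are tiny illustrative values, far from cryptographic strength).

```python
def accumulate(g, n, elems):
    # Toy RSA-style one-way accumulator: acc = g^(x1*x2*...*xk) mod n.
    # Elements are accumulated by repeated modular exponentiation.
    acc = g
    for x in elems:
        acc = pow(acc, x, n)
    return acc

def witness(g, n, elems, member):
    # A member's witness accumulates every *other* element; membership
    # is verified by checking pow(witness, member, n) == acc.
    w = g
    for x in elems:
        if x != member:
            w = pow(w, x, n)
    return w
```

Because exponents commute, the verification equation holds for any member, and the witness stays a single group element, unlike a Merkle proof whose size grows logarithmically with the set.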
{"title":"RAANMF: An adaptive sequence feature representation method for predictions of protein thermostability, PPI, and drug–target interaction","authors":"Qunfang Yan, Shuyi Pan, Zhixing Cheng, Yanrui Ding","doi":"10.1016/j.future.2025.107819","DOIUrl":"10.1016/j.future.2025.107819","url":null,"abstract":"<div><div>Effective sequence representation is essential for analyzing protein structure and function. Sequence representation based on reduced amino acids plays an important part in protein research, as it preserves key sequence features while simplifying feature processing. However, selecting an appropriate reduced amino acid method for a given downstream analysis task is challenging, so developing reduced amino acid methods that can adapt to various downstream tasks is essential for advancing protein-related research. In this paper, we propose a novel reduced amino acid method based on non-negative matrix factorization (NMF), named RAANMF, which can adaptively generate reduced amino acid schemes for different tasks. By validating the effectiveness and universality of RAANMF on three mainstream tasks, namely protein thermostability prediction, protein–protein interaction prediction, and drug–target interaction prediction, we demonstrate that models reconstructed using RAANMF to characterize amino acid sequences achieve comparable or superior predictive performance with greatly reduced feature dimensions compared to the original models. Moreover, the interpretability of RAANMF, analyzed from the perspective of the non-negative matrix clustering principle, helps us understand its biological significance and enhances its credibility and utility in practical applications. As a method developed from NMF, RAANMF offers a straightforward and interpretable approach for extracting latent features, and it is expected to help elucidate the relationships among protein sequence, structure, and function.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"169","pages":"Article 107819"},"PeriodicalIF":6.2,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143681470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
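To make the reduced-alphabet idea concrete: once a reduction scheme maps the 20 amino acids into a few groups, sequences are re-encoded and downstream feature spaces shrink accordingly. The grouping below is a classic physicochemical (hydrophobicity-style) reduction used purely for illustration; the NMF-derived RAANMF schemes are task-specific and not reproduced here.

```python
# Illustrative 5-group reduction of the 20 standard amino acids
# (hypothetical grouping, not an RAANMF-generated scheme).
REDUCTION = {
    "A": "h", "V": "h", "L": "h", "I": "h", "M": "h", "C": "h",  # hydrophobic
    "F": "r", "W": "r", "Y": "r",                                # aromatic
    "K": "p", "R": "p", "H": "p",                                # positive
    "D": "n", "E": "n",                                          # negative
    "S": "o", "T": "o", "N": "o", "Q": "o", "G": "o", "P": "o",  # polar/other
}

def reduce_sequence(seq, scheme=REDUCTION):
    # Re-encode a protein sequence into the reduced alphabet; e.g.
    # tripeptide features then span 5**3 = 125 dimensions instead of
    # 20**3 = 8000, while preserving coarse physicochemical patterns.
    return "".join(scheme[aa] for aa in seq)
```

Replacing this fixed dictionary with a factorization-derived, task-adaptive grouping is precisely what distinguishes an NMF-based method from hand-crafted reductions.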
{"title":"Prediction model of performance–energy trade-off for CFD codes on AMD-based cluster","authors":"Marcin Lawenda, Łukasz Szustak, László Környei","doi":"10.1016/j.future.2025.107810","DOIUrl":"10.1016/j.future.2025.107810","url":null,"abstract":"<div><div>This work explores the importance of the performance–energy correlation for CFD codes, highlighting the need for sustainable and efficient use of clusters. The primary goal is to select and predict the optimal number of computational nodes so as to reduce energy consumption and/or improve calculation time. In this work, the utilisation cost of the cluster, measured in core-hours, is used as a crucial factor in energy consumption and in selecting the optimal number of computational nodes. The work is conducted on a cluster with AMD EPYC Milan-based CPUs and the OpenFOAM application using the Urban Air Pollution model. In order to investigate the performance–energy correlation on the cluster, the <span>CVOPTS</span> (Core VOlume Points per TimeStep) metric is introduced, which allows a direct comparison of parallel efficiency for applications on modern HPC architectures. This metric becomes essential for evaluating and balancing performance with energy consumption to achieve a cost-effective hardware configuration. The results were confirmed by numerous tests on a 40-node cluster, considering representative grid sizes. Based on the empirical results, a prediction model was derived that takes into account both the computational and communication costs of the simulation. The research reveals the impact of the AMD EPYC architecture on superspeedup, where performance increases superlinearly with the addition of more computational resources. This phenomenon enables a priori prediction of performance–energy trade-offs (compute-faster or energy-saving setups) for a specific application scenario through the utilisation of varying quantities of computing nodes.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"169","pages":"Article 107810"},"PeriodicalIF":6.2,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143681466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
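The core-hours cost factor described above leads to a simple selection rule: among node counts whose measured (or predicted) wall time meets a budget, pick the one with the lowest core-hour cost. The sketch below illustrates only this selection step; the paper's actual prediction model, which incorporates communication costs and the CVOPTS metric, is not reproduced, and all numbers are hypothetical.

```python
def cheapest_feasible_config(runtime_by_nodes, cores_per_node, max_hours):
    # runtime_by_nodes: {node_count: wall_time_hours}, measured or
    # predicted. Core-hours (nodes * cores * wall time) serve as the
    # energy/cost proxy; return the cheapest config meeting the
    # runtime budget, or None if none does.
    cost = {n: t * n * cores_per_node
            for n, t in runtime_by_nodes.items() if t <= max_hours}
    return min(cost, key=cost.get) if cost else None
```

With superlinear speedup, a larger node count can be cheaper in core-hours than a smaller one, so the "compute-faster" and "energy-saving" setups may even coincide.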