{"title":"Energy efficiency support for software defined networks: a serverless computing approach","authors":"Fatemeh Banaie , Karim Djemame , Abdulaziz Alhindi , Vasilios Kelefouras","doi":"10.1016/j.future.2025.108121","DOIUrl":"10.1016/j.future.2025.108121","url":null,"abstract":"<div><div>Automatic network management strategies have become paramount for meeting the needs of innovative real-time and data-intensive applications, such as those in the Internet of Things. However, the ever-growing and fluctuating demands for data and services in such applications require more than ever an efficient, scalable, and energy-aware network resource management. To address these challenges, this paper introduces a novel approach that leverages a modular architecture based on serverless functions within an energy-aware environment. By deploying SDN services as Functions as a Service (FaaS), the proposed approach enables dynamic, on-demand network function deployment, achieving significant cost and energy savings through fine-grained resource provisioning. Unlike previous monolithic SDN approaches, this work disaggregates SDN control plane into modular, serverless components, transforming tightly integrated functionalities into independent, on-demand services while ensuring performance, scalability, and energy efficiency. An analytical model is presented to approximate the service delivery time and power consumption, as well as an open source prototype implementation supported by an extensive experimental evaluation. Experimental results demonstrate significant improvement in energy efficiency compared to traditional approaches, highlighting the potential of this approach for sustainable network environments.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108121"},"PeriodicalIF":6.2,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145107952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reduced and mixed precision turbulent flow simulations using explicit finite difference schemes","authors":"Bálint Siklósi , Pushpender K. Sharma , David J. Lusher , István Z. Reguly , Neil D. Sandham","doi":"10.1016/j.future.2025.108111","DOIUrl":"10.1016/j.future.2025.108111","url":null,"abstract":"<div><div>The use of reduced and mixed precision computing has gained increasing attention in high-performance computing (HPC) as a means to improve computational efficiency, particularly on modern hardware architectures like GPUs. In this work, we explore the application of mixed precision arithmetic in compressible turbulent flow simulations using explicit finite difference schemes. We extend the OPS and OpenSBLI frameworks to support customizable precision levels, enabling fine-grained control over precision allocation for different computational tasks. Through a series of numerical experiments on the Taylor–Green vortex benchmark, we demonstrate that mixed precision strategies, such as half-single and single-double combinations, can offer significant performance gains without compromising numerical accuracy. However, pure half-precision computations result in unacceptable accuracy loss, underscoring the need for careful precision selection. Our results show that mixed precision configurations can reduce memory usage and communication overhead, leading to notable speedups, particularly on multi-CPU and multi-GPU systems.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108111"},"PeriodicalIF":6.2,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145048436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Management of autoscaling serverless functions in edge computing via Q-Learning","authors":"Priscilla Benedetti , Mauro Femminella , Gianluca Reali","doi":"10.1016/j.future.2025.108112","DOIUrl":"10.1016/j.future.2025.108112","url":null,"abstract":"<div><div>Serverless computing is a recently introduced deployment model to provide cloud services. The autoscaling of function instances allows adapting allocated resources to workload, so as to reduce latency and improve resource usage efficiency. However, autoscaling mechanisms could be affected by undesired ‘cold starts’ events, causing latency peaks due to spawning of new instances, which can be critical in edge deployments where applications are typically sensitive to latency. In order to regulate autoscaling of functions and mitigate the latency for accessing services, which may hinder the adoption of the serverless model in edge computing, we resort to the usage of reinforcement learning. Our experimental system is based on OpenFaaS, the most popular open-source Kubernetes-based serverless platform. In this system, we introduce a Q-Learning (QL) agent to dynamically configure the Kubernetes Horizontal Pod Autoscaler (HPA). This is accomplished via a QL model state space and a reward function definition that enforce service level agreement (SLA) compliance, in terms of latency, without allocating excessive resources. The agent is trained and tested using real serverless function invocation patterns, made available by Microsoft Azure. The experimental results show the benefits provided by the proposed solution over state-of-the-art in terms of compliance to the SLA, while limiting resource consumption and service request losses.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108112"},"PeriodicalIF":6.2,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145048967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Def-Ag: An energy-efficient decentralized federated learning framework via aggregator clients","authors":"Junyoung Park , Sungpil Woo , Joohyung Lee","doi":"10.1016/j.future.2025.108114","DOIUrl":"10.1016/j.future.2025.108114","url":null,"abstract":"<div><div>Federated Learning (FL) has revolutionized Artificial Intelligence (AI) by enabling decentralized model training across diverse datasets, thereby addressing privacy concerns. However, traditional FL relies on a centralized server, leading to latency, single-point failures, and trust issues. Decentralized Federated Learning (DFL) emerges as a promising solution, but it faces challenges in achieving optimal accuracy and convergence due to limited client interactions, requiring energy inefficiency. Moreover, balancing the personalization and generalization of the AI model in DFL remains a complex issue. To address those challenging problems, this paper presents Def-Ag, an innovative energy-efficient DFL framework utilizing aggregator clients within similarity-based clusters. To reduce this signaling overhead, a partial model information exchange is proposed in intra-cluster training. In addition, the knowledge distillation method is applied for inter-cluster training to carefully incorporate the knowledge between clusters. Finally, by integrating clustering-based hierarchical DFL and optimizing client selection, Def-Ag reduces energy consumption and communication overhead while balancing personalization and generalization. Extensive experiments on CIFAR-10 and FMNIST datasets confirm Def-Ag’s superior performance in reducing energy usage and maintaining learning accuracy compared to baseline methods. The results demonstrate that Def-Ag effectively balances personalization and generalization, providing a robust solution for energy-efficient decentralized federated learning systems.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108114"},"PeriodicalIF":6.2,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145048965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient two-stage computing method for large-scale research interest mining","authors":"Sha Yuan , Zhou Shao","doi":"10.1016/j.future.2025.108117","DOIUrl":"10.1016/j.future.2025.108117","url":null,"abstract":"<div><div>Semantic analysis for academic data is crucial for many scientific services, such as review recommendation, planning research funding directions. Research interest analysis faces challenges in large-scale academic data mining. Traditional methods of representing research interests, such as manual labeling, using statistical or machine learning methods, have limitations. In particular, the computation amount is unacceptable in large-scale multisource information integration. This paper presents an efficient computing method for predicting scholar interests based on the principle of large-scale recommendation systems, consisting of rough and refined sorting. In rough sorting, one-hot encoding, CHI square feature selection, TF-IDF feature extraction, and an SGD-based classifier are used to obtain several top interest labels. In refined sorting, a pre-trained SciBERT model outputs the optimal interest labels. The proposed approach offers two main advantages. Firstly, it improves computational efficiency, as directly using pre-trained models like BERT for large-scale data leads to excessive calculations. Secondly, the algorithm ensures better model performance. Feature selection in the rough sorting stage can avoid the negative impact of irrelevant papers on prediction precision, which is a problem when using pre-trained model directly.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108117"},"PeriodicalIF":6.2,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fine-grained task scheduling strategy for resource auto-scaling over fluctuating data streams","authors":"Yinuo Fan , Dawei Sun , Minghui Wu , Shang Gao , Rajkumar Buyya","doi":"10.1016/j.future.2025.108119","DOIUrl":"10.1016/j.future.2025.108119","url":null,"abstract":"<div><div>Resource scaling is crucial for stream computing systems in fluctuating data stream scenarios. Computational resource utilization fluctuates significantly with changes in data stream rates, often leading to pronounced issues of resource surplus and scarcity within these systems. Existing research has primarily focused on addressing resource insufficiency at runtime; however, effective solutions for handling variable data streams remain limited. Furthermore, overlooking task communication dependencies during task placement in resource adjustment may lead to increased communication cost, consequently impairing system performance. To address these challenges, we propose Ra-Stream, a fine-grained task scheduling strategy for resource auto-scaling over fluctuating data streams. Ra-Stream not only dynamically adjusts resources to accommodate varying data streams, but also employs fine-grained scheduling to optimize system performance further. This paper explains Ra-Stream through the following aspects: (1) Formalization: We formalize the application subgraph partitioning problem, the resource scaling problem and the task scheduling problem by constructing and analyzing a stream application model, a communication model, and a resource model. (2) Resource scaling and heuristic partitioning: We propose a resource scaling algorithm to scale computational resource for adapting to fluctuating data streams. A heuristic subgraph partitioning algorithm is also introduced to minimize communication cost evenly. (3) Fine-grained task scheduling: We present a fine-grained task scheduling algorithm to minimize computational resource utilization while reducing communication cost through thread-level task deployment. (4) Comprehensive evaluation: We evaluate multiple metrics, including latency, throughput and resource utilization in a real-world distributed stream computing environment. Experimental results demonstrate that, compared to state-of-the-art approaches, Ra-Stream reduces system latency by 36.37 % to 47.45 %, enhances system maximum throughput by 26.2 % to 60.55 %, and saves 40 % to 46.25 % in resource utilization.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108119"},"PeriodicalIF":6.2,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145048437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A coupled Lagrangian-AI hierarchical and heterogeneous model for predicting bacteria contamination in farmed mussels","authors":"Ciro Giuseppe De Vita , Gennaro Mellone , Diana Di Luccio , Javier Garcia-Blas , Francesca Barchiesi , Raffaele Montella","doi":"10.1016/j.future.2025.108108","DOIUrl":"10.1016/j.future.2025.108108","url":null,"abstract":"<div><div>The quality of coastal waters, particularly aquaculture zones, is crucial to sustainable development and human health. Traditional monitoring methods based on scheduled in-situ sampling are often too slow, costly, and limited in spatial and temporal coverage to meet the needs of large-scale aquaculture management. To overcome these constraints, we introduce the <em>Artificial Intelligence-based Water QUAlity Plus Plus model</em> (AIQUAM++), an AI-based modeling framework designed to predict E. coli contamination directly within farmed mussels. We evaluated and compared a suite of recent and high-performing machine learning architectures, such as K-Nearest Neighbors (KNN), and several Transformer-based architectures (Transformer, Informer, Reformer, TimesNet), to address the complex temporal dependencies within this time series classification (TSC) task. AIQUAM++ was trained with historical microbiological measures of E. coli levels in the mussels provided by the local authorities involved in food safety monitoring. The system architecture, featuring an inference engine completely written in C++ for high performance, leverages hierarchical parallelism to ensure scalability and computational efficiency, incorporating Message Passing Interface (MPI) for inter-process communication on multi-core architectures, OpenMP for multithreaded processing, and CUDA-based acceleration for GPU-optimized computations. This design enables high-throughput inference that is suitable for operational deployment in aquaculture monitoring networks. A test case application of AIQUAM++ was conducted in the Gulf of Naples (Campania, Italy). Empirical results demonstrated that the proposed system achieves classification accuracies that exceed 90 %, supporting its efficacy as a real-time data-driven decision support tool for aquaculture water quality management, minimizing health risks and contributing to sustainable marine resource governance.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108108"},"PeriodicalIF":6.2,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145107953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SEMQ: Efficient non-uniform quantization with sensitivity-based error minimization for large language models","authors":"Dongmin Li , Xiurui Xie , Dongyang Zhang , Athanasios V. Vasilakos , Man-Fai Leung","doi":"10.1016/j.future.2025.108120","DOIUrl":"10.1016/j.future.2025.108120","url":null,"abstract":"<div><div>Large Language Models (LLMs) represent a pivotal breakthrough in computational intelligence, showcasing exceptional capabilities in information aggregation and reasoning. However, their remarkable performance comes at the cost of ultra-high-scale parameters, leading to significant resource demands during deployment. Therefore, various model compression techniques have been developed, such as pruning, distillation, and quantization. Among these, quantization has gained prominence due to its ability to directly reduce the precision of model weights and activations, resulting in substantial memory savings and accelerated inference. Despite its advantages, existing quantization approaches face substantial challenges in ultra-low precisions (e.g., 2-bit), often resulting in severe performance degradation. To tackle this challenge, we propose a novel non-uniform quantization with minimal disturbance for LLM, which contains two innovations: (i) a Sensitivity-based Error Minimization Non-Uniform Quantization (SEMQ) algorithm, which finds the quantization scheme to minimize the quantization error through continuous iteration; and (ii) a Z-score-based method for outlier detection and isolation under the normal distribution assumption, reducing the complexity of the quantization process. The extensive experiments on the LLaMA family demonstrates that the proposed SEMQ enables the ultra-low precision quantization up to 2-bit, and 10<span><math><mo>×</mo></math></span> GPU memory reduction for origin LLMs while maintaining the model accuracy. Our code is publicly available at <span><span>https://github.com/ldm2060/semq</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108120"},"PeriodicalIF":6.2,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145048438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive-oriented mutation snake optimizer for scheduling budget-constrained workflows in heterogeneous cloud environments","authors":"Yanfen Zhang , Longxin Zhang , Buqing Cao , Jing Liu , Wenyu Zhao , Jianguo Chen , Keqin Li","doi":"10.1016/j.future.2025.108118","DOIUrl":"10.1016/j.future.2025.108118","url":null,"abstract":"<div><div>Cloud computing, recognized as an advanced computing paradigm, facilitates flexible and efficient resource management and service delivery through virtualization and resource sharing. However, the computational capabilities of resources in heterogeneous cloud environments are often correlated with their costs; thus, budget constraints are imposed on users who require rapid response times. We introduce a novel metaheuristic optimization algorithm called the snake optimizer (SO), which is aimed at workflow scheduling in cloud environments, to tackle the challenge mentioned. We also integrate random mutation to enhance the algorithm’s global search capability to overcome the limitation of SO’s being prone to local optima. Additionally, we aim to increase the success rate of finding feasible solutions within budget constraints; thus, we implement a directional strategy to guide the evolutionary paths of the snake individuals. In this context, excessive randomness and overly rigid directionality can adversely affect the algorithm’s search performance. We propose an adaptive-oriented mutation (AOM) mechanism to balance the two aspects mentioned. This AOM mechanism is integrated with SO to create AOM-SO, which effectively addresses the makespan minimization problem for workflow scheduling under budget constraints in heterogeneous cloud environments. Comparative experiments using real-world scientific workflows show that AOM-SO achieves a 100 % success rate in identifying feasible solutions. Moreover, compared with the state-of-the-art algorithms, it reduces makespan by an average of 43.03 %.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108118"},"PeriodicalIF":6.2,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145048962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ContraMST: A unified framework for dynamic MST maintenance","authors":"Akanksha Dwivedi, Dip Sankar Banerjee","doi":"10.1016/j.future.2025.108097","DOIUrl":"10.1016/j.future.2025.108097","url":null,"abstract":"<div><div>Dynamic graphs, characterized by frequent changes in their topological structure through the addition or deletion of edges or vertices, present significant challenges for algorithm design. This work introduces <em>ContraMST</em>, a suite of algorithms for efficiently processing dynamic graphs in a batched setting. We employ a tree contraction mechanism to create a hierarchical representation of the input graph, facilitating the identification of localized updates. This approach enables the maintenance of critical graph primitives, such as the minimum spanning tree (MST), without requiring recomputation from scratch. Experimental results demonstrate the effectiveness of <em>ContraMST</em> on real-world graphs, where batch-dynamic algorithms are crucial for efficiently handling updates in different batch processing scenarios.</div><div>Specifically, our technique highlights <em>ContraMST’s</em> performance across various update scenarios: IMB (Incremental), DMB (Decremental), and FDM (Fully Batch Dynamic) MST. For IMB, we demonstrate experimental validations on GPUs, where our proposed technique achieves up to 3.43<span><math><mo>×</mo></math></span> speedup compared to equivalent parallel implementations on shared-memory CPUs. Additionally, it provides up to 4.23<span><math><mo>×</mo></math></span> speedup over conventional parallel computation from scratch. For DMB, experimental results show that <em>ContraMST</em> achieves up to 4.98<span><math><mo>×</mo></math></span> speedup on GPUs compared to equivalent parallel implementations on shared-memory CPUs, with an additional 5.12<span><math><mo>×</mo></math></span> speedup over conventional parallel computation from scratch. For FDM, our experimental validations demonstrate that <em>ContraMST</em> achieves up to 6.56<span><math><mo>×</mo></math></span> speedup on GPUs over shared-memory CPU implementations and up to 7.31<span><math><mo>×</mo></math></span> speedup compared to conventional parallel computation from scratch. This significant improvement is attributed to <em>ContraMST’s</em> ability to process IMB and DMB operations together, reducing redundant computations and fully utilizing GPU parallelism. These results underscore <em>ContraMST’s</em> efficiency in managing dynamic graph updates in a batch setting, leveraging GPU parallelism to enhance performance across all update scenarios.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108097"},"PeriodicalIF":6.2,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145107954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}