Panagiotis Giannakopoulos , Bart van Knippenberg , Kishor Chandra Joshi , Nicola Calabretta , George Exarchakos
{"title":"perfCorrelate: Performance variability correlation framework","authors":"Panagiotis Giannakopoulos , Bart van Knippenberg , Kishor Chandra Joshi , Nicola Calabretta , George Exarchakos","doi":"10.1016/j.future.2025.107827","DOIUrl":"10.1016/j.future.2025.107827","url":null,"abstract":"<div><div>Edge computing is a promising technology for deploying time-sensitive and privacy-sensitive applications closer to the premises of users. However, it is crucial to identify the sources of performance variability caused by application co-location to meet user requirements effectively. Monitoring systems typically expose hundreds of metrics, making comprehensive analysis challenging. As a result, researchers often rely on a small, arbitrarily selected subset of metrics for tasks such as building performance predictors. In this paper, we examine how the available monitoring metrics are correlated with Round Trip Time (RTT) fluctuations and suggest directions for building performance models. Our experiments focus on a Single Particle Analysis (SPA) application for an electron microscopy use case, deployed in a Kubernetes environment and monitored by Prometheus. We demonstrate that while a subset of monitoring metrics consistently correlates with performance, the specific metrics in this subset can vary due to dynamic application co-locations and observation windows. Consequently, the optimal number of metrics and the choice of machine learning model needed to accurately capture performance variability vary between different scenarios (co-location and cluster nodes). These differences directly impact the effectiveness of scheduling decisions in resource clusters, which depend on performance predictors. 
Our work presents a method to systematically identify the monitoring metrics most relevant to changes in RTT and to determine the most representative observation window, ensuring a more generalizable understanding of the performance of the application throughout its lifecycle.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107827"},"PeriodicalIF":6.2,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143808732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Special issue on big data computing service and machine learning applications","authors":"Katerina Potika , Magdalini Eirinaki , Monica Vitali , Anna Bernasconi , Hiroyuki Fujioka","doi":"10.1016/j.future.2025.107836","DOIUrl":"10.1016/j.future.2025.107836","url":null,"abstract":"<div><div>This Special Issue addresses the evolving landscape of big data generated by sensors, devices, and services. The shift from centralized cloud infrastructures to distributed systems that involve cloud, edge, and Internet of Things (IoT) devices requires innovative approaches to managing and analyzing big data. The key challenges include privacy, security, energy efficiency, data quality, and trust. This Special Issue invited researchers to submit innovative solutions covering topics such as: Big Data Analytics and Machine Learning; Integrated, Heterogeneous, and Distributed Infrastructures for Big Data Management; Big Data Platforms and Technologies; Real-time Big Data Services and Applications; Big Data Security and Privacy Preservation; Big Data Quality and Trust; Trustworthy data sharing; Sustainability and Energy-Efficiency of Big Data; Storage and Computation; Big Data and Analytics for Healthcare; Big Data Applications and Experiences. 
This initiative expands on discussions from the IEEE Big Data Service (BDS) 2023 conference held in Athens, Greece, reaching a broader audience of researchers.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"171 ","pages":"Article 107836"},"PeriodicalIF":6.2,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143825001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prototype-based fine-tuning for mitigating data heterogeneity in federated learning","authors":"Liming Chai, Jun Xie, Nanrun Zhou","doi":"10.1016/j.future.2025.107831","DOIUrl":"10.1016/j.future.2025.107831","url":null,"abstract":"<div><div>In federated learning with data heterogeneity, the global model often exhibits a severe imbalance in fitting data from different categories, and clients may not be able to obtain useful information from the impaired global model. To address this challenge, Federated Learning Based on Model Repair (FedMR) is proposed to repair the global model by a set of prototypes with minimal divergence. The repair step of FedMR is executed after global aggregation and before local training. Different clients first obtain similar local prototypes on the same feature extractor, and then fine-tune the global classifier with these local prototypes. The repaired classifier is aggregated at the server and broadcast to all clients, enabling them to start local training from a consensus point. This approach effectively mitigates the adverse effects of uneven sample distribution. In most experimental configurations, FedMR outperforms the state-of-the-art federated learning algorithms.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107831"},"PeriodicalIF":6.2,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143808753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dong Kyu Sung , Sunggon Kim , Sangjin Lee , Houjun Tang , Alex Sim , Kesheng Wu , Suren Byna , Yongseok Son
{"title":"Regen: An object layout regenerator on large-scale production HPC systems","authors":"Dong Kyu Sung , Sunggon Kim , Sangjin Lee , Houjun Tang , Alex Sim , Kesheng Wu , Suren Byna , Yongseok Son","doi":"10.1016/j.future.2025.107830","DOIUrl":"10.1016/j.future.2025.107830","url":null,"abstract":"<div><div>This article proposes an object layout regenerator called Regen, which regenerates and removes the object layout dynamically to improve the read performance of applications. Regen first detects frequent access patterns from the I/O requests of the applications. Second, Regen reorganizes the objects and regenerates or preallocates new object layouts according to the identified access patterns. Finally, Regen removes or reuses the obsolete or regenerated object layouts as necessary. As a result, Regen accelerates access to objects by providing a flexible object layout. We implement Regen as a framework on top of the Proactive Data Container (PDC) and evaluate it on the Cori supercomputer, a production-scale HPC system, by using realistic HPC I/O benchmarks. The experimental results show that Regen improves the I/O performance by up to 16.92<span><math><mo>×</mo></math></span> compared with an existing system.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"171 ","pages":"Article 107830"},"PeriodicalIF":6.2,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143838902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-loss models for proactive-tolerance Reed–Solomon storage systems","authors":"Jing Li, Zhenrui Zhou, Jianli Ding","doi":"10.1016/j.future.2025.107832","DOIUrl":"10.1016/j.future.2025.107832","url":null,"abstract":"<div><div>Proactive fault tolerance increasingly serves as an added protection for data in Reed–Solomon (RS) systems. Compared with declustered placement, grouped placement reduces the failure units and also decreases the repair parallelism, which have opposite effects on system reliability. For an RS (<span><math><mi>k</mi></math></span>, <span><math><mi>m</mi></math></span>) system, the values of (<span><math><mi>k</mi></math></span>, <span><math><mi>m</mi></math></span>) impact storage overhead, fault tolerance and repair traffic. When designing proactive RS storage systems, it is challenging to choose the proper placement scheme and coding scheme.</div><div>This paper presents four general reliability equations for estimating the number of data-loss events and the amount of data loss in proactive RS systems using declustered and grouped placement schemes. These equations model the effect of disk/node failures, repair bandwidth, block errors, disk scrubbing, disk/node failure prediction, stripe placement, and coding scheme on the reliability of systems. Moreover, we design a Monte-Carlo based simulator to analyze the reliability of proactive Reed–Solomon systems. The results of the equations agree well with the simulation results, which demonstrates the effectiveness of our proposed equations. 
Using these mathematical models, we can easily estimate and compare fault-tolerance schemes and placement schemes and learn the effect of system parameters on system reliability, which facilitates the design and maintenance of cloud storage systems.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107832"},"PeriodicalIF":6.2,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143792547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing the output of time series forecasting algorithms for cloud resource provisioning","authors":"Ferran Agullo , Alberto Gutierrez-Torre , Jordi Torres , Josep Ll. Berral","doi":"10.1016/j.future.2025.107833","DOIUrl":"10.1016/j.future.2025.107833","url":null,"abstract":"<div><div>Forecasting the resource consumption of workloads is a frequent approach in the cloud provisioning field. Ideally, such predictions enable more accurate scheduling and management of resources in a computing cluster. However, the current approaches fail to properly forecast the future consumption in areas where sudden increases of consumption are present, <em>i.e.</em>, spikes. Even commonly employed metrics lack the ability to properly evaluate sharp behaviours in the traces. This may generate resource starvation problems in the running workloads and decrease the Quality of Service (QoS) provided to external users. To address this issue, we propose two strategies that modify the outputs of forecasting algorithms without changing the algorithms’ internals. The new outputs considerably enhance the prediction of sudden increases, doubling the F1 score on average for all tested algorithms. This improvement in the handling of spikes comes with an increased over-provision of resources. Nevertheless, the proposed strategies give the user an easy way to control this trade-off between predicting spikes and the amount of over-provision. The user can decide which balance best fits the requirements of their specific scenario. Furthermore, we propose a new evaluation methodology that better assesses the behaviour of forecasting algorithms in cloud traces, especially focused on the performance around increases of consumption, and we give insights on the reasons behind the predictions of the algorithms with the application of explainability techniques. 
The code repository of this work is available on GitHub at <span><span>https://github.com/FerranAgulloLopez/ResourceForecasting</span></span>.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107833"},"PeriodicalIF":6.2,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143817083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unmanned aerial vehicle swarm-assisted reliable federated learning for traffic flow prediction","authors":"Man Zhou , Lansheng Han , Yangyang Geng","doi":"10.1016/j.future.2025.107828","DOIUrl":"10.1016/j.future.2025.107828","url":null,"abstract":"<div><div>Unmanned Aerial Vehicle (UAV) swarms, as efficient and flexible monitoring tools, can collect real-time traffic information over extensive areas. However, UAV swarms engaged in traffic monitoring are vulnerable to network attacks and privacy breaches, leading to data distortion and compromised system performance. To address these security challenges and incentivize UAV participation, we propose CI-AGFL, a federated learning (FL)-based swarm intelligence approach that enables distributed traffic flow prediction through seamless information sharing and fusion between ground vehicles and UAV swarms. In CI-AGFL, ground vehicles train local models, which are then aggregated into a global model by UAV swarms using a robust, decentralized aggregation method grounded in consensus confirmation. Furthermore, a fuzzy membership method is employed to evaluate UAV trustworthiness during the model aggregation phase. Additionally, we introduce a reputation-based multi-dimensional contract theory incentive mechanism to optimize UAV participation in federated learning tasks, dynamically balancing energy consumption with training latency to ensure accurate, real-time traffic flow predictions. 
Experimental results demonstrate that CI-AGFL outperforms three advanced traffic flow prediction methods, achieving improvements of 8.2% to 22.8% in MAE, MSE, RMSE, and MAPE metrics, while significantly enhancing model convergence.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107828"},"PeriodicalIF":6.2,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance portability of sparse matrix–vector multiplication implemented using OpenMP, OpenACC and SYCL","authors":"Kinga Stec, Przemysław Stpiczyński","doi":"10.1016/j.future.2025.107825","DOIUrl":"10.1016/j.future.2025.107825","url":null,"abstract":"<div><div>The aim of this paper is to study the performance portability of OpenMP, OpenACC and SYCL implementations of sparse matrix–vector product (SpMV) and its extended version in which the dot product of the input vector and the result is also calculated, for CSR and BSR storage formats, on Intel and AMD CPUs and NVIDIA GPU platforms. We compare it with the performance portability of much more sophisticated implementations provided by the vendors in their Intel oneAPI MKL and NVIDIA cuSPARSE libraries. Using the reformulated performance portability metric, we show how it changes for various sparse matrices and which portable implementation and format achieve better performance portability. Numerical experiments show that the considered portable implementations for the CSR format usually achieve better performance than for the BSR format. On GPU, CSR OpenACC implementations for SpMV and SpMV-DOT tend to be the best. 
On CPU, CSR OpenMP implementation usually gives the best results for SpMV-DOT, while CSR OpenMP and BSR MKL achieve the best results for a similar number of matrices.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107825"},"PeriodicalIF":6.2,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143739224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Njoud O. Al-Maaitah , Javier Garcia-Blas , Genaro Sanchez-Gallegos , Jesus Carretero , Marc-André Vef , André Brinkmann
{"title":"A comparative study of ad-hoc file systems for extreme scale computing","authors":"Njoud O. Al-Maaitah , Javier Garcia-Blas , Genaro Sanchez-Gallegos , Jesus Carretero , Marc-André Vef , André Brinkmann","doi":"10.1016/j.future.2025.107815","DOIUrl":"10.1016/j.future.2025.107815","url":null,"abstract":"<div><div>High-performance computing (HPC) systems often suffer from interference caused by multiple applications accessing a shared parallel file system, which can negatively impact compute performance. One solution to this problem is to add new tiers to the HPC storage hierarchy that can absorb I/O bursts and support moving data between tiers based on its hotness. Ad-hoc file systems serve as an intermediate storage layer that leverages new storage technologies, such as non-volatile random access memory devices and flash-based solid state drives, to provide temporary storage based on application behavior in the HPC environment. A variety of ad-hoc file systems have been proposed recently. In this survey, we will explore the integration of fast storage layers into HPC storage hierarchies. We will examine various ad-hoc file systems highlighting their features and functionalities to categorize the proposed solutions into different groups.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107815"},"PeriodicalIF":6.2,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143734798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wei Huang, Andrea Araldo, Hind Castel-Taleb, Badii Jouaber
{"title":"Dimensioning network slices for power minimization under reliability constraints","authors":"Wei Huang, Andrea Araldo, Hind Castel-Taleb, Badii Jouaber","doi":"10.1016/j.future.2025.107824","DOIUrl":"10.1016/j.future.2025.107824","url":null,"abstract":"<div><div>Network slicing allows multiplexing virtualized networks, called <em>slices</em>, over a single physical network infrastructure. Research has extensively focused on the placement of virtual functions and the links that compose each network slice. On the other hand, performance greatly depends on how many resources are allocated to virtual nodes and links, <em>after</em> they are placed. This aspect has been mostly neglected.</div><div>In this paper, we propose a method to dimension computation and network resources to slices, with the aim of minimizing dynamic power consumption. Latency and power are the result of non-trivial couplings between different components of each slice. Therefore, minimizing power while satisfying the reliability constraints of all slices is challenging. To capture these couplings, we model slices as multiple Jackson networks (one per slice) co-existing in the same resource-constrained physical network. To the best of our knowledge, we are the first to employ Jackson Networks in such a setting. Dynamic power savings are in large part obtained by finely deciding CPU clock frequency, exploiting Dynamic Voltage Frequency Scaling (DVFS). Via numerical evaluation, we show that our method finds for each slice just the right amount of resources to satisfy latency constraints (expressed in probabilistic terms, as chance-constraints). 
This yields a relevant reduction in dynamic power with respect to baselines representing the state of the art in network slicing, which focuses on placement without specific strategies for resource dimensioning.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"170 ","pages":"Article 107824"},"PeriodicalIF":6.2,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143734889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}