{"title":"A terminology for scientific workflow systems","authors":"Frédéric Suter , Tainã Coleman , İlkay Altintaş , Rosa M. Badia , Bartosz Balis , Kyle Chard , Iacopo Colonnelli , Ewa Deelman , Paolo Di Tommaso , Thomas Fahringer , Carole Goble , Shantenu Jha , Daniel S. Katz , Johannes Köster , Ulf Leser , Kshitij Mehta , Hilary Oliver , J.-Luc Peterson , Giovanni Pizzi , Loïc Pottier , Rafael Ferreira da Silva","doi":"10.1016/j.future.2025.107974","DOIUrl":"10.1016/j.future.2025.107974","url":null,"abstract":"<div><div>The term “scientific workflow” has evolved over the last two decades to encompass a broad range of compositions of interdependent compute tasks and data movements. It has also become an umbrella term for processing in modern scientific applications. Today, many scientific applications can be considered as workflows made of multiple dependent steps, and hundreds of workflow systems have been developed to manage and run these scientific workflows. However, no turnkey solution has emerged from the field to address the diversity of scientific processes and the infrastructure on which they are supposed to be implemented. Instead, new research problems requiring the execution of scientific workflows with some novel feature often lead to the development of an entirely new workflow system. A direct consequence of this situation is that many existing workflow management systems (WMSs) share some salient features, offer similar functionalities, and can manage the same categories of workflows but at the same time also have some distinct capabilities that can be important for specific applications. This situation makes researchers who develop workflows face the complex question of selecting a WMS. 
This selection can be driven by technical considerations, to find the system that is the most appropriate for their application and for the computing and storage resources available to them, or other factors such as reputation, adoption, strong community support, or long-term sustainability. To address this problem, a group of WMS developers and practitioners joined their efforts to produce a community-based terminology of WMSs. This paper summarizes their findings and introduces this new terminology to characterize WMSs. This terminology is composed of five axes: workflow structure and characteristics, composition, orchestration, data management, and metadata capture. Each axis comprises several concepts that capture the prominent features of WMSs. Based on this terminology, this paper also presents a classification of 23 existing WMSs according to the proposed axes and terms.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107974"},"PeriodicalIF":6.2,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144480933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Federated learning framework for collaborative remaining useful life prognostics: An aircraft engine case study","authors":"Diogo Landau , Ingeborg de Pater , Mihaela Mitici , Nishant Saurabh","doi":"10.1016/j.future.2025.107945","DOIUrl":"10.1016/j.future.2025.107945","url":null,"abstract":"<div><div>Complex systems such as aircraft engines are continuously monitored by sensors. In predictive aircraft maintenance, the collected sensor measurements are used to estimate the health condition and the Remaining Useful Life (RUL) of such systems. However, a major challenge when developing prognostics is the limited number of run-to-failure data samples. This challenge could be overcome if multiple airlines shared their run-to-failure data samples so that sufficient learning can be achieved. Due to privacy concerns, however, airlines are reluctant to share their data in a centralized setting. In this paper, a collaborative federated learning (FL) framework is therefore developed instead. Here, several airlines cooperate to train a collective RUL prognostic machine learning model, without the need to centrally share their data. For this, a decentralized validation procedure is proposed to validate the prognostics model without sharing any data. Moreover, sensor data is often noisy and of low quality. This paper therefore proposes four novel methods to aggregate the parameters of the global prognostic model. These methods enhance the robustness of the FL framework against noisy data. The proposed framework is illustrated for training a collaborative RUL prognostic model for aircraft engines, using the N-CMAPSS dataset. Here, six airlines are considered that collaborate in the FL framework to train a collective RUL prognostic model for their aircraft’s engines. 
When comparing the proposed FL framework with the case where each airline independently develops their own prognostic model, the results show that FL leads to more accurate RUL prognostics for five out of the six airlines. Moreover, the novel robust aggregation methods render the FL framework robust to noisy data samples.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107945"},"PeriodicalIF":6.2,"publicationDate":"2025-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144480934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consolidation of virtual machines to reduce energy consumption of data centers by using ballooning, sharing and swapping mechanisms","authors":"Simon Lambert , Eddy Caron , Laurent Lefevre , Rémi Grivel","doi":"10.1016/j.future.2025.107968","DOIUrl":"10.1016/j.future.2025.107968","url":null,"abstract":"<div><div>Data centers have major environmental impacts due to their energy consumption and the manufacturing of equipment. They emit greenhouse gases and consume energy and resources, such as rare earth and water. Efficient computing resource management is therefore a key challenge for Cloud service providers today as they need to meet a growing demand while limiting the oversizing of their infrastructures. Mechanisms derived from virtualization, such as Virtual Machines (VMs) consolidation, are used to optimize resource management and infrastructure sizing, but economic and technical constraints can hinder their adoption. They require prior infrastructure knowledge and usage study to evaluate their potential, involve complex placement algorithms, and are sometimes difficult to implement in hypervisors. In this paper, we propose <em>ORCA (OuR Consolidation Algorithm)</em>, a complete consolidation methodology designed to facilitate the production implementation of such mechanisms. This methodology includes the study of VM usage, the use of prediction models, and a VM placement algorithm that takes advantage of resource oversubscription. The choice of relevant oversubscription ratios is also addressed, with a focus on memory overcommitment through the study of memory overcommitment mechanisms: ballooning, page sharing, and swapping. Results from a detailed simulation process and deployment on a production infrastructure are presented. The methodology is tested in simulation on two production infrastructure datasets, with power consumption reduction as high as 29.8% and without consolidation error. 
The production deployment using VMware vSphere and considering fault tolerance requirements reduces the energy consumption by 6.12% without causing any performance degradation.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107968"},"PeriodicalIF":6.2,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Balancing data center traffic load with speeding up flow-transmission","authors":"Tao Zhang , Yunsheng Liu , Haotian Jing , Siyuan Fan , Haozhi Tang , Xidao Luan","doi":"10.1016/j.future.2025.107972","DOIUrl":"10.1016/j.future.2025.107972","url":null,"abstract":"<div><div>To boost network transmission performance and thereby improve the service quality of distributed cloud applications, modern data center networks achieve very high bisection bandwidth for data transmissions by offering multiple end-to-end parallel paths. Nonetheless, many existing data center load balancing (DCLB) schemes only address how to improve path utilization to mitigate congestion hot-spots, while remaining agnostic to data center traffic patterns and the diverse requirements of flow-transmission, which leads to sub-optimal network transmission performance. This paper introduces <em>Packet Cloning Load Balancing</em> (PCLB) for achieving efficient data center flow-transmission. PCLB selectively generates <em>Clone Packets</em> (each clone packet is an exact copy of its original data packet) to endow traffic with more rerouting opportunities by considering both flow-transmission phases and path states, thus helping heterogeneous data center flows choose more appropriate paths to speed up their data transmission. 
Experimental results of numerous simulations show that PCLB significantly reduces the average and tail flow completion time for delay-sensitive flows, while the performance of throughput-oriented flows can always be maintained at a high level.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107972"},"PeriodicalIF":6.2,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining one-off high average utility episodes for process event logs","authors":"Zhihong Dong, Jing Liu, Youxi Wu","doi":"10.1016/j.future.2025.107938","DOIUrl":"10.1016/j.future.2025.107938","url":null,"abstract":"<div><div>High utility episode (HUE) mining is an emerging and highly popular research field within data mining, where the aim is to mine all episodes with utility no less than a user-specified threshold. It has been successfully applied in various domains. However, existing research on HUE does not consider the length of episodes, and it must be guaranteed that the utility of each event remains unchanged, which prevents the compatibility of HUE with real-life applications. To tackle this problem, we argue that there is a need to mine one-off high average utility episodes (MAUE). An approach called MAUE-Miner is presented, which has three main modules: database reconstruction, candidate episode generation, and average utility calculation. For the reconstructed database, we present a sequence extraction (SeqExtraction) strategy, which can improve the efficiency of searching for occurrences of episodes. Since MAUE mining does not satisfy the anti-monotonicity property, we introduce a pruning strategy based on the upper bound utility of an episode, which can prune unpromising candidate episodes in advance. To calculate the average utility, depth-first search and backtracking strategies for the event index are adopted, a method that can efficiently find occurrences by avoiding a linear search. Experimental results indicate that MAUE-Miner achieves better performance than alternative methods. More importantly, a case study shows that MAUE-Miner can be applied to a real industrial log to identify paths with high numbers of rejected parts, to discover optimal production processes, and to provide recommendations for further improvement. 
The code can be downloaded from https://github.com/wuc567/Pattern-Mining/tree/master/MAUE-Miner.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107938"},"PeriodicalIF":6.2,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MPGP-QOC: Multi-programming and graph-partition-based QOC for QNN inference","authors":"Yiding Liu","doi":"10.1016/j.future.2025.107966","DOIUrl":"10.1016/j.future.2025.107966","url":null,"abstract":"<div><div>Quantum neural networks (QNNs) in quantum computing hold promise for transforming machine learning, potentially offering advantages over classical computers. Their unique properties and quantum parallelism open avenues for exploring enhanced computational capabilities in the quantum domain, presenting intriguing opportunities for advancing machine learning applications. However, building efficient QNNs inference remains a challenge due to the high computational complexity, the difficulty in optimizing the quantum circuits, and the underutilization of current quantum hardware. To address these challenges, we introduce a new QNNs inference framework named Multi-Programming and Graph-Partition-based Quantum Optimal Control (MPGP-QOC) that combines parallelization, quantum optimal control and graph partitioning with parameterized quantum circuits. Our framework is designed to be compatible with existing quantum software and hardware platforms, making it easy to implement and experimentally validate. We demonstrate the effectiveness of our framework by applying it to several quantum machine learning classification tasks with different QNNs configurations. 
Our experimental results show that MPGP-QOC achieves a significant speedup of 10.1× (up to 10.4×) over the state-of-the-art QNNs inference, with a notable compilation reduction (up to 6.5×), while maintaining a comparable level of accuracy (up to 6% improvement).</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107966"},"PeriodicalIF":6.2,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Edge-assisted U-shaped split federated spatial–temporal attention GCN for traffic flow prediction","authors":"Fujie Ren, Detian Liu, Haibin Liu, Yang Cao","doi":"10.1016/j.future.2025.107965","DOIUrl":"https://doi.org/10.1016/j.future.2025.107965","url":null,"abstract":"Traffic flow prediction (TFP) is a critical task for mitigating congestion and improving transportation efficiency. Effective prediction models must balance accuracy, privacy preservation, and resource efficiency in decentralized environments. To meet these requirements, we propose EUFed-STAGCN, an edge-assisted federated learning framework with U-shaped split architecture for privacy-preserving traffic flow prediction. The core innovations of our work are: (1) Spatial–temporal attention-enhanced GCN: A lightweight backbone model, STAGCN, is introduced, consisting of multiple configurable ST-blocks that integrate spatial and temporal attention mechanisms to selectively capture complex spatial–temporal dynamics across traffic networks. The number of ST-blocks can be adjusted based on the computational capacity of edge nodes, enhancing flexibility and scalability in practical deployments. (2) U-shaped split federated training: The model adopts a split training architecture where the input and output layers of the STAGCN model are executed locally on edge devices, while the intermediate ST-blocks are offloaded to edge servers for forward and backward propagation. This design reduces client-side overhead while maintaining full model expressiveness. Additionally, a topology perturbation mechanism based on differential privacy is employed to protect shared structural information during topology aggregation. Extensive experiments on both simulated and real-world settings validate the effectiveness of EUFed-STAGCN. 
The model achieves competitive prediction accuracy compared to centralized and federated baselines, improves training efficiency under resource constraints, and maintains robust performance across heterogeneous devices and data—making it well suited for real-world intelligent transportation systems.","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"51 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SPADE: Simulator-assisted Performability Design for UAV-based monitoring systems","authors":"Qingyang Zhang , Fumio Machida , Ermeson Andrade","doi":"10.1016/j.future.2025.107967","DOIUrl":"10.1016/j.future.2025.107967","url":null,"abstract":"<div><div>As Uncrewed Aerial Vehicles (UAV) have been used widely in a variety of real-world monitoring applications, quality design of UAV-based monitoring systems becomes an emergent challenge as it involves complex trade-offs among several performance criteria. While analytical models have been used for performance analysis of UAV systems, they often rely on hypothetical parameter values due to difficulty in accessing real-world systems, resulting in a gap between theory and practice. To fill this gap, this paper proposes <em>SPADE</em> (Simulator-assisted PerformAbility Design methodology for UAV-based Systems), an approach that integrates performance profiling with a realistic flight scenario generated by a UAV simulator and model-based performance analysis. We demonstrate the application of SPADE through a case study that focuses on designing a UAV-based ecological monitoring system using an object detection algorithm (YOLOv5). Our analysis explores key trade-offs among several quality metrics, including detection accuracy, performance, energy consumption, and service availability. Using Stochastic Petri Nets, we conduct numerical evaluations, with baseline parameter values estimated from performance profiling on an emulated computing device. Experimental results using YOLOv5 provide valuable insights into how image resolution and computation modes impact UAV-based system performance and availability. 
These findings offer practical guidance for improving UAV system design.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107967"},"PeriodicalIF":6.2,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI-driven orchestration at scale: Estimating service metrics on national-wide testbeds","authors":"Rodrigo Moreira , Rafael Pasquini , Joberto S.B. Martins , Tereza C. Carvalho , Flávio de Oliveira Silva","doi":"10.1016/j.future.2025.107971","DOIUrl":"10.1016/j.future.2025.107971","url":null,"abstract":"<div><div>Network Slicing (NS) realization requires AI-native orchestration architectures to efficiently and intelligently handle heterogeneous user requirements. To achieve this, network slicing is evolving towards a more user-centric digital transformation, focusing on architectures that incorporate native intelligence to enable self-managed connectivity in an integrated and isolated manner. However, these initiatives face the challenge of validating their results in production environments, particularly those utilizing ML-enabled orchestration, as they are often tested in local networks or laboratory simulations. This paper proposes a large-scale validation method using a network slicing prediction model to forecast latency using Deep Neural Networks (DNNs) and basic ML algorithms embedded within an NS architecture evaluated in real large-scale production testbeds. It measures and compares the performance of different DNNs and ML algorithms, considering a distributed database application deployed as a network slice over two large-scale production testbeds. 
The investigation highlights how AI-based prediction models can enhance network slicing orchestration architectures and presents a seamless, production-ready validation method as an alternative to fully controlled simulations or laboratory setups.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107971"},"PeriodicalIF":6.2,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated data, metadata, and paradata management system for 3D Digital Cultural Heritage objects: Workflow automation, federated authentication, and publication","authors":"Michał Orzechowski, Łukasz Opioła, Ignacio Lamata Martínez, Marinos Ioannides, Panayiotis N. Panayiotou, Łukasz Dutka, Renata G. Słota, Jacek Kitowski","doi":"10.1016/j.future.2025.107964","DOIUrl":"https://doi.org/10.1016/j.future.2025.107964","url":null,"abstract":"The complexity of high-quality 3D digitised cultural heritage objects creates challenges for existing data management systems as they need to develop metadata management and processing capabilities to provide semantic insight into the interconnectivity of data that constitutes cultural heritage objects. To address these challenges, we propose a global federated authentication and authorisation mechanism, a data and metadata management system, and an integrated engine for designing and executing automated workflows that facilitate the processing of both data and metadata. The solution is evaluated with three distinct 3D digitised cultural objects and presents the complete process from data upload to cultural heritage object publication.","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"15 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144337840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}