{"title":"State of practice: Evaluating GPU performance of state vector and tensor network methods","authors":"Marzio Vallero, Paolo Rech, Flavio Vella","doi":"10.1016/j.future.2025.107927","DOIUrl":"10.1016/j.future.2025.107927","url":null,"abstract":"<div><div>The frontier of quantum computing (QC) simulation on classical hardware is quickly reaching the hard scalability limits for computational feasibility. Nonetheless, there is still a need to simulate large quantum systems classically, as the Noisy Intermediate Scale Quantum (NISQ) devices are yet to be considered fault tolerant and performant enough in terms of operations per second. Each of the two main exact simulation techniques, state vector and tensor network simulators, boasts specific limitations.</div><div>This article investigates the limits of current state-of-the-art simulation techniques on a test bench made of eight widely used quantum subroutines, each in different configurations, with a special emphasis on performance. We perform both single process and distributed scaleability experiments on a supercomputer. We correlate the performance measures from such experiments with the metrics that characterise the benchmark circuits, identifying the main reasons behind the observed performance trends. Specifically, we perform distributed sliced tensor contractions, and we analyse the impact of pathfinding quality on contraction time, correlating both results with topological circuit characteristics. From our observations, given the structure of a quantum circuit and the number of qubits, we highlight how to select the best simulation strategy, demonstrating how preventive circuit analysis can guide and improve simulation performance by more than an order of magnitude.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107927"},"PeriodicalIF":6.2,"publicationDate":"2025-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144223669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliable and efficient computation offloading for dependency-aware tasks in IIoT using evolutionary multi-objective optimization","authors":"Jun-Jie Li , Zheng-Yi Chai , Ya-Lun Li , Ying-Bo Zhou","doi":"10.1016/j.future.2025.107923","DOIUrl":"10.1016/j.future.2025.107923","url":null,"abstract":"<div><div>Mobile Edge Computing (MEC) significantly enhances the computing and processing capabilities of Industrial Internet of Things (IIoT) systems with hardware resource constraints. However, the complexity of industrial environments and the dependencies between tasks in industrial applications pose new challenges for collaborative edge computing offloading between IIoT devices and MEC systems. In industrial applications, complex workflow tasks can be modeled using Directed Acyclic Graphs (DAGs). The execution order of each task and the offloading decisions within the workflow can result in varying completion times, ultimately affecting the overall Quality of Experience (QoE) of the workflow. Therefore, it is crucial to pay attention to the execution order of tasks and offloading decisions within workflows in MEC-assisted industrial systems. This paper introduces an enhanced multi-objective evolutionary algorithm designed to minimize both the completion time of industrial applications and the energy usage of IIoT devices. Firstly, a retransmission model is incorporated to simulate the phenomenon caused by packet loss in complex industrial environments. Then, a dynamic task scheduling algorithm based on response rate and a delay-based execution position initialization method are designed to achieve optimal scheduling and offloading for DAG applications. Finally, to prevent task failures due to retransmission delays, two optimization strategies for task retransmission mechanisms are proposed. Comprehensive experimental results demonstrate that, under the same computational offloading scenarios, the proposed algorithm achieves better convergence and diversity of non-dominated solutions,while significantly reducing system latency and energy consumption.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107923"},"PeriodicalIF":6.2,"publicationDate":"2025-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144254750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GraphOpticon: A Global proactive horizontal autoscaler for improved service performance & resource consumption","authors":"Theodoros Theodoropoulos , Yashwant Singh Patel , Uwe Zdun , Paul Townend , Ioannis Korontanis , Antonios Makris , Konstantinos Tserpes","doi":"10.1016/j.future.2025.107926","DOIUrl":"10.1016/j.future.2025.107926","url":null,"abstract":"<div><div>The increasing complexity of distributed computing environments necessitates efficient resource management strategies to optimize performance and minimize resource consumption. Although proactive horizontal autoscaling dynamically adjusts computational resources based on workload predictions, existing approaches primarily focus on improving workload resource consumption, often neglecting the overhead introduced by the autoscaling system itself. This could have dire ramifications on resource efficiency, since many prior solutions rely on multiple forecasting models per compute node or group of pods, leading to significant resource consumption associated with the autoscaling system. To address this, we propose GraphOpticon, a novel proactive horizontal autoscaling framework that leverages a singular global forecasting model based on Spatio-temporal Graph Neural Networks. The experimental results demonstrate that GraphOpticon is capable of providing improved service performance, and resource consumption (caused by the workloads involved and the autoscaling system itself). As a matter of fact, GraphOpticon manages to consistently outperform other contemporary horizontal autoscaling solutions, such as Kubernetes’ Horizontal Pod Autoscaler, with improvements of 6.62% in median execution time, 7.62% in tail latency, and 6.77% in resource consumption, among others.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107926"},"PeriodicalIF":6.2,"publicationDate":"2025-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144262989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is quantum optimization ready? An effort towards neural network compression using adiabatic quantum computing","authors":"Zhehui Wang , Benjamin Chen Ming Choong , Tian Huang , Daniel Gerlinghoff , Rick Siow Mong Goh , Cheng Liu , Tao Luo","doi":"10.1016/j.future.2025.107908","DOIUrl":"10.1016/j.future.2025.107908","url":null,"abstract":"<div><div>Quantum optimization is the most mature quantum computing technology to date, providing a promising approach towards efficiently solving complex combinatorial problems. Methods such as adiabatic quantum computing (AQC) have been employed in recent years on important optimization problems across various domains. In deep learning, deep neural networks (DNN) have reached immense sizes to support new predictive capabilities. Optimization of large-scale models is critical for sustainable deployment, but becomes increasingly challenging with ever-growing model sizes and complexity. While quantum optimization is suitable for solving complex problems, its application to DNN optimization is not straightforward, requiring thorough reformulation for compatibility with commercially available quantum devices. In this work, we explore the potential of adopting AQC for fine-grained pruning-quantization of convolutional neural networks. We rework established heuristics to formulate model compression as a quadratic unconstrained binary optimization (QUBO) problem, and assess the solution space offered by commercial quantum annealing devices. Through our exploratory efforts of reformulation, we demonstrate that AQC can achieve effective compression of practical DNN models. Experiments demonstrate that adiabatic quantum computing (AQC) not only outperforms classical algorithms like genetic algorithms and reinforcement learning in terms of time efficiency but also excels at identifying global optima.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107908"},"PeriodicalIF":6.2,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144270675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LSDMA: Levelized security driven deadline constrained multiple workflow allocation model in cloud computing","authors":"Mahfooz Alam , Mohammad Shahid , Suhel Mustajab , Mohammad Sajid","doi":"10.1016/j.future.2025.107941","DOIUrl":"10.1016/j.future.2025.107941","url":null,"abstract":"<div><div>Security and deadline sensitivity play a crucial role in many cloud applications, requiring a high level of security to ensure confidentiality, integrity, and authentication in cross-platform data transmission. Therefore, we designed a levelized security driven deadline constrained multiple workflow allocation (LSDMA) model in order to optimize the risk probability of satisfying the workflow tasks’ security demands and deadline constraints. To ensure the security in cloud platform, three-level security services, i.e., authentication, integrity, and confidentiality, are employed. However, existing secured workflow allocation literature reports very few models for multiple workflows addressing security and deadline requirements. It leaves scope to develop new models in the domain. Further, the completion time of the workflows is also improved by using level-wise allocation, inserting best-fit successor tasks into idle gaps, and adopting a parallel communication mechanism between levels during execution to reduce overall communication overhead. The prototype simulator for the secured workflow allocator is designed and implemented in MATLAB for performance evaluation with the competitive models from the domain. The meticulous simulation results show that the LSDMA model outperforms at both batch of random workflows and real workflows among the state-of-the-art models on the considered QoS parameters under study. The experimental findings indicate that LSDMA surpasses SDS, LBSIR, SAHEFT, SODA, and HEFT regarding the risk probability, achieving improvements of 21 %–73 %, 15 %–74 %, 12 %–72 %, and 14 %–73 % when varying the batch of random, Montage, CyberShake, and LIGO workflows, respectively.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107941"},"PeriodicalIF":6.2,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144240644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the performance of CP2K simulations on the CPU-GPDSP Fusion intra-heterogeneous HPC system","authors":"Qi Du , Feng Wang , Hui Huang","doi":"10.1016/j.future.2025.107912","DOIUrl":"10.1016/j.future.2025.107912","url":null,"abstract":"<div><div>This study explores the performance of CP2K on a heterogeneous HPC system integrating CPU and GPDSP, aiming to optimize computational efficiency for large-scale molecular simulations. CP2K is an open-source software package designed for simulating condensed matter systems, particularly excelling in handling complex quantum chemistry and molecular dynamics workloads. We present the integration of CPU and GPDSP in a heterogeneous processor environment, detailing key optimizations, including vectorization of integral operations in Density Functional Theory (DFT) and GEMM optimization based on processor memory architecture. Furthermore, we propose a parallel computing strategy tailored to the hardware’s architectural characteristics to maximize performance. Benchmarking results using the CP2K test suite demonstrate significant computational and parallel efficiency gains. For instance, in a water molecule simulation, the system achieves 79% parallel efficiency when scaled to 256 compute nodes, utilizing approximately 400,000 cores. Finally, we conduct a comparative performance analysis between CPU-GPDSP and AVX-512 vector processors, highlighting the advantages and potential limitations of GPDSP acceleration in heterogeneous HPC environments.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107912"},"PeriodicalIF":6.2,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144178767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Singular Value Decomposition-based lightweight LSTM for time series forecasting","authors":"Changwei Liu , Hao Ren , Guoqiang Li , Haojie Ren , Xiaojun Liang , Chunhua Yang , Weihua Gui","doi":"10.1016/j.future.2025.107910","DOIUrl":"10.1016/j.future.2025.107910","url":null,"abstract":"<div><div>Long–short-term memory (LSTM) neural networks are known for their exceptional performance in various domains, particularly in handling time series data and managing long-term dependencies. However, deploying LSTM often faces challenges due to limitations in memory and computational resources, especially in edge computing and real-time processing scenarios. To maximize the advantages of LSTM in resource-constrained environments, this paper presents a lightweight LSTM method that uses weight matrix decomposition. Specifically, it employs Singular Value Decomposition (SVD) to decompose the weight matrices within the LSTM Cell and fully connected layers. Then, an optimization method is addressed to enable the efficient development of a lightweight model by dynamically assessing and enhancing storage and computational efficiency through adjustments of the learning rate and weight parameters. The experimental results indicate that this method reduces the parameters of the LSTM model by 45%, compresses the model size to 45% of its original size, and maintains prediction accuracy without decline. It means that the proposed method based on weight matrix decomposition allows LSTM to operate with less computational power and memory, making them more feasible for deploying resource-constrained devices.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107910"},"PeriodicalIF":6.2,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144189488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MicroFaaS: Adaptive serverless computing for Internet of Things","authors":"Olgierd Krolik , Tomasz Szydlo","doi":"10.1016/j.future.2025.107914","DOIUrl":"10.1016/j.future.2025.107914","url":null,"abstract":"<div><div>Cloud and edge computing solutions, and especially serverless offerings, are promising areas of technology that can provide additional computing resources to Internet of Things (IoT) devices. This research aims to design and evaluate a novel adaptive computations offloading framework for the IoT domain that leverages serverless Function-as-a-Service (FaaS) solutions capabilities to intelligently select the most suitable execution environment to run the computations in. Pretrained cost estimation models are constructed for each function and each environment (FaaS platform) and they are used by offloading strategies on IoT devices to determine the best execution environment for each invocation. Conducted research demonstrate that pretraining of cost estimation models significantly reduces the time required to calibrate the decision-making offloading algorithm on devices. Evaluation results also prove that it is possible to achieve better function execution times by using offloading algorithms that intelligently select the execution environment for each invocation and can adapt themselves quickly to sudden deterioration of network conditions by monitoring the network state.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107914"},"PeriodicalIF":6.2,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144212967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Drug combination prediction for parasitic diseases through information-augmented hypergraph neural network","authors":"Lei Li , Hongyu Zhang , Meng Mi , Haitao Li , Guodong Lü , Chunhou Zheng , Yansen Su","doi":"10.1016/j.future.2025.107913","DOIUrl":"10.1016/j.future.2025.107913","url":null,"abstract":"<div><div>Although drug combination therapies are a well-established strategy in the treatment of parasitic diseases, identifying novel synergistic drug combinations poses a challenge due to the vast combinatorial space involved. Recently, computational approaches have emerged as an efficient and cost-effective means to prioritize combinations for testing. However, the limited availability of known drug combinations for treating parasitic diseases poses a challenge, hindering the training of computational models for accurate predictions. To address the above issue, we propose an information-augmented hypergraph neural network-based computational method named IHGNNDDS to predict potential synergistic drug combinations targeting parasitic diseases. First, the known drug combinations collected from PubChem database are converted into a drug synergy hypergraph. Then, information-augmented hypergraph neural network (IHGNN), consisting of pre-learning augmentation of topology-level and attribute augmentation of semantic-level, is designed to fully explore the existing information in the hypergraph. The features of the drugs and parasitic diseases learned from the hypergraph are concatenated according to the triplet structures and then input into the prediction module for identifying potential synergistic drug combinations targeting parasitic diseases. In the comparison experiments, IHGNNDDS surpasses all baseline methods, demonstrating notable improvements in predictive accuracy. The results of case study prove that IHGNNDDS has the ability to identify potential synergistic drug combinations.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107913"},"PeriodicalIF":6.2,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144212968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Autoscaling of microservice resources based on dense connectivity spatio-temporal GNN and Q-learning","authors":"Pengjuan Liang, Yaling Xun, Jianghui Cai, Haifeng Yang","doi":"10.1016/j.future.2025.107909","DOIUrl":"10.1016/j.future.2025.107909","url":null,"abstract":"<div><div>Autoscaling technology enables cloud-native systems to adapt to dynamic workload changes by scaling outward or inward without manual intervention. However, when facing sudden and unpredictable workloads, it becomes particularly difficult to determine which services need to be scaled and to assess the amount of resources required, especially for complex time-varying service dependencies that are difficult to accurately quantify. To adaptively and accurately evaluate the resource requirements of different services under dynamic workloads and minimize costs under the constraints of service level agreements (SLAs), a microservice resource autoscaling solution (AGQ) that combines a Spatio-temporal Graph Neural Network (STGNN) based on dense connections with Q-learning is proposed. AGQ models interdependent microservices as a graph structure, integrating real-time monitored resource status data into feature vectors for each node. By introducing the dense connection-based STGNN model, it enhances the ability to capture feature information and facilitates gradient propagation. Then, the dense connection-based STGNN model was introduced to enhance its ability to capture feature information and gradient propagation, for more accurately predicting future resource usage. Finally, reinforcement learning Q-learning is adopted to effectively evaluate scheduling strategies and optimize resource allocation by simultaneously relying on historical experience and the predictions from the STGNN model. The experimental results show that the collaborative optimization strategy AGQ can better adapt to changes in service dependency relationships, more accurately manage resources. AGQ achieve superior cost efficiency and lower SLA violation rate compared to several advanced methods.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"174 ","pages":"Article 107909"},"PeriodicalIF":6.2,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144185335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}