{"title":"Efficient edge-based data integrity auditing in cloud storage","authors":"Hao Yan , Yan Wang , Guoxiu Liu , Juan Zhao","doi":"10.1016/j.future.2025.107899","DOIUrl":"10.1016/j.future.2025.107899","url":null,"abstract":"<div><div>Edge computing increasingly collaborates with cloud computing to support numerous applications that involve large data volumes and frequent data interactions. In cloud-edge collaboration environments, applications especially with high requirements for low data transmission delay often deploy frequently accessed client data replicas on edge servers to improve data access efficiency. Consequently, client data is often distributed across both cloud and edge servers in practice. Therefore, efficiently verifying the integrity of all client data poses a complex and urgent challenge. To address this issue, the paper introduces a novel data integrity auditing scheme capable of efficiently performing asynchronous integrity checks on client data across both edge and cloud servers. In our scheme, clients only generate partial block tags and upload them along with the data to the edge server. Edge server computes complete tags based on the partial tags, caches a small portion of frequently accessed data, and transfers the remaining data to the cloud server. For data verification, edge servers provide partial integrity proofs for cached data, supporting the cloud server to generate complete proofs for all challenged data. Thus, the auditors can verify all client data, regardless of its storage location. In our scheme, edge clients bear only about half of the computational workload of existing schemes. Additionally, the cloud server also offloads a portion of computational and storage tasks to edge servers, significantly improving the overall efficiency of data checking. 
We theoretically prove the security of our scheme, and experimental results demonstrate its efficiency and feasibility.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107899"},"PeriodicalIF":6.2,"publicationDate":"2025-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143936896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving self-supervised vertical federated learning with contrastive instance-wise similarity and dynamical balance pool","authors":"Shuai Chen , Wenyu Zhang , Xiaoling Huang , Cheng Zhang , Qingjun Mao","doi":"10.1016/j.future.2025.107884","DOIUrl":"10.1016/j.future.2025.107884","url":null,"abstract":"<div><div>Vertical Federated Learning (VFL) enables multiple parties with distinct feature spaces to train a joint VFL model collaboratively without exposing their original private data. In realistic scenarios, the scarcity of aligned and labeled samples among collaborating participants limits the effectiveness of traditional VFL approaches for model training. Current VFL frameworks attempt to leverage abundant unlabeled data using Contrastive Self-Supervised Learning (CSSL). However, the simplistic incorporation of CSSL methods cannot address severe domain shift in VFL. In addition, CSSL methods typically conflict with general regularization approaches designed to alleviate domain shift, thereby significantly limiting the potential of the self-supervised learning framework in VFL. To address these challenges, this study proposes an Improved Self-Supervised Vertical Federated Learning (ISSVFL) framework for VFL in label-scarce scenarios under the semi-honest and no-collusion assumption. ISSVFL merges CSSL with instance-wise similarity to resolve regularization conflicts and captures more significant inter-domain knowledge in the representations from different participants, effectively alleviating domain shift. In addition, a new dynamical balance pool is proposed to fine-tune the pre-trained models for downstream supervised tasks by dynamically balancing inter-domain and intra-domain knowledge. 
Extensive empirical experiments on image and tabular datasets demonstrate that ISSVFL achieves an average performance improvement of 3.3 % compared with state-of-the-art baselines.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107884"},"PeriodicalIF":6.2,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143931576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AS2: Adaptive sorting algorithm selection for heterogeneous workloads and systems","authors":"Sangmyung Lee , Byungyoon Lee , Yongseok Son , Kiwook Sohn , Hwajung Kim , Sunggon Kim","doi":"10.1016/j.future.2025.107860","DOIUrl":"10.1016/j.future.2025.107860","url":null,"abstract":"<div><div>Sorting is becoming increasingly important in modern computing, ranging from small-scale Internet of Things (IoT) devices to supercomputers. To improve sorting performance, various algorithms, including Intro sort, Merge sort, Heap sort, and Insertion sort, are adopted in different systems. However, the performance of sorting algorithms depends on various factors, and our analysis shows that the optimal algorithm varies, with no single algorithm consistently outperforming the others. In this paper, we first analyze data internal factors (data size, distribution, data type) and external factors (threads, different hardware) that impact sorting algorithm performance. We utilize widely adopted sorting algorithms such as STL sort and Merge sort, as well as state-of-the-art sorting algorithms like Ips4o sort and Aips2o sort. In addition to sequential sorting algorithms, we implement Parallel Intro sort and utilize the parallel versions of state-of-the-art sorting algorithms with varying number of threads. From the analysis, we present an adaptive sorting algorithm selection model for heterogeneous workloads and systems, called AS2 (Adaptive Sorting Algorithm Selection). Its goal is to determine the optimal algorithm from the existing sorting algorithms in heterogeneous workloads and systems. AS2 uses various ML models to build performance models for each sorting algorithm using data internal and external factors from various datasets. Then, AS2 chooses the optimal sorting algorithm based on the performance prediction using the model. We evaluate AS2 using a representative dataset that includes various data internal and external factors. 
The results show that AS2 can accurately predict the performance of various sorting algorithms, with minimum and maximum R-squared values of 0.83 and 0.99, respectively. In addition, AS2 successfully selects the optimal algorithm in our evaluation scenario with up to 99.68% accuracy by choosing the algorithm with the shortest predicted sorting time, improving performance by up to 1.83<span><math><mo>×</mo></math></span> compared to the state-of-the-art algorithm. We also evaluate the performance of AS2 using a real-world dataset, and the results show that AS2 selects the optimal algorithm with 87.50% accuracy.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107860"},"PeriodicalIF":6.2,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143918585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-agent deep reinforcement learning based multi-task partial computation offloading in mobile edge computing","authors":"Han Li , Shunmei Meng , Jin Sun , Zhicheng Cai , Qianmu Li , Xuyun Zhang","doi":"10.1016/j.future.2025.107861","DOIUrl":"10.1016/j.future.2025.107861","url":null,"abstract":"<div><div>Mobile edge computing (MEC) can enhance the computation performance of end-devices by providing computation offloading service at the network edge. However, given that both end-devices and edge servers have finite computation resources, inefficient offloading policies may lead to overload, thereby increasing the computation delays of tasks. In this paper, we investigate a multi-task partial computation offloading problem combined with a queue model. Based on achieving load-balancing across the MEC system, our objective is to minimize the long-standing average task-processing cost of the end-devices while ensuring the delay thresholds of tasks. For this purpose, a distributed offloading algorithm utilizing the multi-agent deep reinforcement learning (MADRL) method is proposed. Specifically, through interacting with the MEC environment and accumulating experience data, the device agents can collaborate to optimize their local offloading decisions over continuous time-slots, which includes adjusting the transmission power and determining the tasks’ offloading ratios under the dynamic wireless channel conditions. 
Exhaustive experimental results demonstrate that in contrast with the baseline algorithms, the proposed offloading algorithm can not only better balance the computation loads between the end-devices and the MEC server, but also more effectively reduce the task-processing cost of the end-devices, as well as the percentage of timeout tasks.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107861"},"PeriodicalIF":6.2,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143903692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Harnessing quality-throughput trade-off in scoring functions for extreme-scale virtual screening campaigns","authors":"Yuedong Zhang, Gianmarco Accordi, Davide Gadioli, Gianluca Palermo","doi":"10.1016/j.future.2025.107863","DOIUrl":"10.1016/j.future.2025.107863","url":null,"abstract":"<div><div>Drug discovery is a long and costly process aimed at finding a molecule that yields a therapeutic effect. Virtual screening is one of the initial in-silico steps that aims at estimating how promising a molecule is. This stage needs to solve two well-known domain problems: molecular docking and scoring. While the accuracy of scoring functions is extensively investigated in comparisons, the execution time of their implementation is usually not considered. In virtual screening campaigns, the definition of a fixed time budget for the entire process and the average time required to process each molecule determines the upper limit of the number of molecules that can be evaluated. By reducing the time needed to evaluate a single molecule, we can screen a larger number of molecules, thereby increasing the possibility of finding a promising solution. For extreme-scale virtual screening campaigns, the computational budget is a critical aspect since even utilizing large-scale facilities would make it impractical to complete the screening within a feasible time unless the computational time for a single molecule is significantly reduced.</div><div>In this paper, we explore optimization and approximation techniques applied to two well-known scoring functions, which we modify to investigate different accuracy-performance trade-offs to support large-scale virtual screening campaigns. Despite the different approaches we considered, experimental results demonstrate that the proposed enhancements achieve better enrichment factors in virtual screening scenarios. 
Moreover, we port both implementations to CUDA to show that the proposed techniques are GPU-friendly and aligned with modern supercomputing infrastructures.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107863"},"PeriodicalIF":6.2,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143918584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive container auto-scaling for fluctuating workloads in cloud","authors":"Xiaoyue Feng, Sijia Zhang, Tianzhe Jiao, Chaopeng Guo, Jie Song","doi":"10.1016/j.future.2025.107872","DOIUrl":"10.1016/j.future.2025.107872","url":null,"abstract":"<div><div>Database-as-a-Service (DBaaS) provides services for multiple tenants through resource containers, which are allowed to scale over time to fulfill the service-level agreements. Designing container auto-scaling methods for DBaaS can help reduce their expenditure. Reinforcement Learning (RL) shows powerful performance in cloud resource scaling due to its robustness in dynamic environments. However, the RL-based methods fail to maintain high performance for fluctuating workloads since their fixed-action design cannot adapt to numerous variations of the resource demand. This paper proposes an adaptive container auto-scaling method called Asner that includes an improved RL-based algorithm with a dynamic action model to solve the problem of fixed-action design. Asner consists of a resource estimation model (<em>Estimator</em>) and an RL-based scaling algorithm (<em>Scaler</em>). <em>Estimator</em> adopts a graph-based method to estimate the workload resource demand for container scaling. <em>Scaler</em> generates the container scaling strategy by employing an improved RL-based algorithm with a dynamic action model for adapting to the fluctuating workload. 
Our experimental results show that <em>Estimator</em> achieves about 93% accuracy under the TPC-DS dataset, <em>Scaler</em>’s performance is about 30% higher than the state-of-the-art RL, and Asner improves its performance by up to 45% compared to other methods.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107872"},"PeriodicalIF":6.2,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143903691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Special issue on advances in techniques for assessment performance portability of HPC applications","authors":"Ami Marowka, Przemysław Stpiczyński, Roman Wyrzykowski","doi":"10.1016/j.future.2025.107876","DOIUrl":"10.1016/j.future.2025.107876","url":null,"abstract":"<div><div>This special issue aims to present new developments and advances in techniques for assessment performance portability of high performance computing applications. It contains revised and extended versions of selected papers presented at the 10th Workshop on Language-Based Parallel Programming Models, WLPP 2024, which was a part of 15th International Conference on Parallel Processing and Applied Mathematics, PPAM 2024, held on September 8–11, 2024, in Ostrava, Czech Republic.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"171 ","pages":"Article 107876"},"PeriodicalIF":6.2,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating privacy loss in differential privacy based federated learning","authors":"Shangyin Weng, Yan Gou, Lei Zhang, Muhammad Ali Imran","doi":"10.1016/j.future.2025.107848","DOIUrl":"10.1016/j.future.2025.107848","url":null,"abstract":"<div><div>Federated learning (FL) trains a global model by aggregating local training gradients, but private information can be leaked from these gradients. To enhance privacy, differential privacy (DP) is often used by adding artificial noise. However, this approach reduces accuracy compared to noise-free learning. Balancing privacy protection and model accuracy remains a key challenge for DP-based FL. Additionally, current methods use theoretical bounds to measure privacy loss, lacking an intuitive assessment. In this paper, we first propose an evaluation method for privacy leakage in the FL by utilizing reconstruction attacks to analyze the difference between the original images and reconstructed ones. We then formulate the problems of investigating DP’s effect on the reconstruction attack, where we study the accumulative privacy loss under two different reconstruction attack settings and prove that anonymous local clients can decrease the probability of privacy leakage. Next, we study the effects of different clipping methods, including fixed constants and the median value of the unclipped gradients’ norm, on privacy protection and learning performance. Furthermore, we derive the theoretical convergence analysis for the cosine similarity and <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>-norm-based reconstruction attack under DP noise. 
We conduct extensive simulations to show how DP settings affect privacy leakage and characterize the trade-off between privacy protection and learning accuracy.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107848"},"PeriodicalIF":6.2,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143895584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Run time dynamic digital twins and dynamic digital twins networks","authors":"Alexander Vodyaho , Radhakrishnan Delhibabu , Dmitry I. Ignatov , Nataly Zhukova","doi":"10.1016/j.future.2025.107823","DOIUrl":"10.1016/j.future.2025.107823","url":null,"abstract":"<div><div>Digital twins are widely used for building various types of cyber–physical systems. There are a huge number of publications devoted to the use of digital twins in production systems. Much less attention is paid to the issues of building runtime digital twins. The article describes an approach to building complex distributed cyber–physical systems with a high level of architectural dynamics built on fog and edge computing platforms based on the use of digital twins. The issues of implementing runtime digital twins and distributed systems of runtime digital twins are considered. The requirements to runtime digital twins are defined. Typical problem statements for constructing and maintaining a runtime digital twin system are formulated. A reference architecture of a dynamic runtime digital twin is proposed, which includes a model of the observed system (or the object) and a model processor. The dynamic model of the observed and managed system is considered as a key element of the digital twin. Possible approaches to the synthesis of built-in models of runtime digital twins are discussed. Examples of using the proposed approach to solve practical problems are given. The described approach may be of interest to specialists involved in research and development of various types of information systems implemented on Internet of Things platforms, such as smart cities, smart transport, medical information systems, etc. 
It is proposed to conduct further research and development in the areas of creating human digital twins.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107823"},"PeriodicalIF":6.2,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143928704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SmartKV: A cost-effective and low-latency geo-distributed key-value store for the computing continuum","authors":"Juan Aznar Poveda , Maximilian Franz Ebner , Thomas Fahringer , Zahra Najafabadi Samani , Marlon Etheredge , Stefan Pedratscher , Nishant Saurabh","doi":"10.1016/j.future.2025.107857","DOIUrl":"10.1016/j.future.2025.107857","url":null,"abstract":"<div><div>Many data-intensive and distributed applications rely on low-latency and scalable key–value storage systems across the Computing Continuum. Key–value storage systems typically use consistent hashing or hash slot-sharding mechanisms to distribute data across storage nodes, which ensures load balancing but often leads to sub-optimal response times and monetary costs, particularly in geo-distributed systems where nodes might have different unit prices and be widely dispersed. In this paper, we propose <span>SmartKV</span>, a cost-efficient geo-distributed key–value store that optimizes data placement dynamically, abstracting the intricacies of data organization, transfer, access, and processing. <span>SmartKV</span> integrates a decentralized data placement algorithm that optimizes the replication factor and selects suitable locations for key–value pairs and replicas, balancing cost and access latency while keeping optimization overhead low. We employ a realistic cost model based on public and private Cloud and Edge providers that consider data transfer, request, and storage costs. In addition to conventional key–value pairs, <span>SmartKV</span> supports active key–value pairs, which enable the definition of custom data types and the execution of user-defined functions directly on the storage side. This contributes to reducing data transfer costs and round-trip times. We thoroughly evaluate <span>SmartKV</span> across different regions of the Chameleon testbed using several realistic workloads. 
Results show that the utilized decentralized data placement strategy allows <span>SmartKV</span> to reduce round trip times between 9 and 84% while reducing costs up to 4.84<span><math><mo>×</mo></math></span> under different client workloads and consistency models compared to state-of-the-art data placement strategies.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"171 ","pages":"Article 107857"},"PeriodicalIF":6.2,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143878899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}