{"title":"Dynamically Scheduling Deadline-Constrained Interleaved Workflows on Heterogeneous Computing Systems","authors":"Kun Cai;Quanwang Wu;Mengchu Zhou;Chao Chen;Junhao Wen;Shouguang Wang","doi":"10.1109/TSC.2025.3536317","DOIUrl":"https://doi.org/10.1109/TSC.2025.3536317","url":null,"abstract":"Heterogeneous computing systems are extensively utilized to execute a wide range of time-critical services, which encompass numerous interdependent tasks organized in the form of workflows. In practice, the dynamic arrival of workflows often interleaves with their execution, leading to resource contention among multiple workflows and potentially causing QoS (Quality of Service) degradation. However, compared to the extensive research on single workflow scheduling, interleaved workflow scheduling has received relatively less attention. Moreover, the challenge of effectively scheduling limited computing resources to promptly complete consecutively arriving workflows remains underexplored, despite its practical importance. To fill this gap, this work proposes a method called Urgency-based List Scheduling (ULS) for dynamically scheduling deadline-constrained interleaved workflows. In ULS, a novel task property called urgency is introduced to prioritize tasks from multiple workflows by capturing real-time execution information, and each newly arrived workflow is scheduled with the outstanding tasks of prior workflows based on a list-based strategy to make more informed decisions. 
Extensive evaluation experiments are performed and the findings illustrate that ULS can achieve a reduction of at least 68% in deadline miss rates and 77% in overall tardiness compared to existing methods.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"758-769"},"PeriodicalIF":5.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143824130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MSCNet: Multi-Scale Network With Convolutions for Long-Term Cloud Workload Prediction","authors":"Feiyu Zhao;Weiwei Lin;Shengsheng Lin;Shaomin Tang;Keqin Li","doi":"10.1109/TSC.2025.3536313","DOIUrl":"https://doi.org/10.1109/TSC.2025.3536313","url":null,"abstract":"Accurate workload prediction is crucial for resource allocation and management in large-scale cloud data centers. While many approaches have been proposed, most existing methods are based on Recurrent Neural Networks (RNNs) or their variants, focusing on short-term cloud workload prediction without considering or identifying the long-term changes and different periodic patterns of cloud workloads. Due to variations in user demands or workload dynamics, cloud workloads that appear stable in the short term often exhibit distinct patterns in the long term. This can lead to a significant decline in prediction accuracy for existing methods when applied to long-term cloud workload forecasting. To address these challenges and overcome the limitations of current approaches, we propose a Multi-Scale Network with Convolutions (MSCNet) for accurate long-term cloud workload prediction. MSCNet employs multi-scale modeling of the original cloud workload to effectively extract multi-scale features and different periodic patterns, learning the long-term dependencies among the cloud workload. Our core component, the Multi-Scale Block, combines the Multi-Scale Patch Block, Transformer Encoder, and Multi-Scale Convolutions Block for comprehensive multi-scale learning. This enables MSCNet to adaptively learn both short-term and long-term features and patterns of cloud workloads, resulting in accurate long-term cloud workload predictions. Extensive experiments are conducted using real-world cloud workload data from Alibaba, Google, and Azure to validate the effectiveness of MSCNet. 
The experimental results demonstrate that MSCNet achieves accurate long-term cloud workload prediction with a computational complexity of $O(L^{2}d)$, outperforming existing state-of-the-art methods.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"969-982"},"PeriodicalIF":5.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143821683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-Efficient and Accuracy-Aware DNN Inference With IoT Device-Edge Collaboration","authors":"Wei Jiang;Haichao Han;Daquan Feng;Liping Qian;Qian Wang;Xiang-Gen Xia","doi":"10.1109/TSC.2025.3536311","DOIUrl":"https://doi.org/10.1109/TSC.2025.3536311","url":null,"abstract":"Due to the limited energy and computing resources of Internet of Things (IoT) devices, the collaboration of IoT devices and edge servers is considered to handle the complex deep neural network (DNN) inference tasks. However, the heterogeneity of IoT devices and the various accuracy requirements of inference tasks make it difficult to deploy all the DNN models in edge servers. Moreover, a large-scale data transmission is engaged in collaborative inference, resulting in an increased demand on spectrum resource and energy consumption. To address these issues, in this paper, we first design an accuracy-aware multi-branch DNN inference model and quantify the relationship between branch selection and inference accuracy. Then, based on the multi-branch DNN model, we aim to minimize the energy consumption of devices by jointly optimizing the selection of DNN branches and partition layers, as well as the computing and communication resources allocation. The proposed problem is a mixed-integer nonlinear programming problem. We propose a hierarchical approach to decompose the problem, and then solve it with a proportional integral derivative based searching algorithm. 
Experimental results demonstrate our proposed scheme has better inference performance and can reduce the total energy consumption up to 65.3%, compared to other collaboration schemes.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"784-797"},"PeriodicalIF":5.5,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143824648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Federated Learning Through Layer-Wise Aggregation Over Non-IID Data","authors":"Yang Xu;Ying Zhu;Zhiyuan Wang;Hongli Xu;Yunming Liao","doi":"10.1109/TSC.2025.3536309","DOIUrl":"10.1109/TSC.2025.3536309","url":null,"abstract":"Nowadays, federated learning (FL) has been widely adopted to train deep neural networks (DNNs) among massive devices without revealing their local data in edge computing (EC). To relieve the communication bottleneck of the central server in FL, hierarchical federated learning (HFL), which leverages edge servers as intermediaries to perform model aggregation among devices in proximity, comes into being. Nevertheless, the existing HFL systems may not perform training effectively due to bandwidth constraints and non-IID issues on devices. To conquer these challenges, we introduce an <u>H</u>FL system with device-<u>e</u>dge <u>a</u>ssignment and <u>l</u>ayer selection, namely Heal. Specifically, Heal organizes all the devices into a hierarchical structure (i.e., device-edge assignment) and enables each device to forward only a sub-model with several valuable layers for aggregation (i.e., layer selection). This processing procedure is called layer-wise aggregation. To further save communication resource and improve the convergence performance, we then design an iteration-based algorithm to optimize the development of our layer-wise aggregation strategy by considering the data distribution as well as resource constraints among devices. 
Extensive experiments on both the physical platform and the simulated environment show that Heal accelerates DNN training by about 1.4–12.5×, and reduces the network traffic consumption by about 31.9–64.1%, compared with the existing HFL systems.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"798-811"},"PeriodicalIF":5.5,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143056366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vertical Federated Unlearning via Backdoor Certification","authors":"Mengde Han;Tianqing Zhu;Lefeng Zhang;Huan Huo;Wanlei Zhou","doi":"10.1109/TSC.2025.3536312","DOIUrl":"10.1109/TSC.2025.3536312","url":null,"abstract":"Vertical Federated Learning (VFL) offers a novel paradigm in machine learning, enabling distinct entities to train models cooperatively while maintaining data privacy. This method is particularly pertinent when entities possess datasets with identical sample identifiers but diverse attributes. Recent privacy regulations emphasize an individual's <i>right to be forgotten</i>, which necessitates the ability for models to unlearn specific training data. The primary challenge is to develop a mechanism to eliminate the influence of a specific client from a model without erasing all relevant data from other clients. Our research investigates the removal of a single client's contribution within the VFL framework. We introduce an innovative modification to traditional VFL by employing a mechanism that inverts the typical learning trajectory with the objective of extracting specific data contributions. This approach seeks to optimize model performance using gradient ascent, guided by a pre-defined constrained model. We also introduce a backdoor mechanism to verify the effectiveness of the unlearning procedure. Our method avoids fully accessing the initial training data and avoids storing parameter updates. Empirical evidence shows that the results align closely with those achieved by retraining from scratch. 
Utilizing gradient ascent, our unlearning approach addresses key challenges in VFL, laying the groundwork for future advancements in this domain.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"1110-1123"},"PeriodicalIF":5.5,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143057118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Service Placement, Task Scheduling, and Resource Allocation in Hierarchical Collaborative MEC Systems","authors":"An Du;Jie Jia;Schahram Dustdar;Jian Chen;Xingwei Wang","doi":"10.1109/TSC.2025.3536307","DOIUrl":"10.1109/TSC.2025.3536307","url":null,"abstract":"Mobile edge computing (MEC) pushes cloud computing capabilities to the network edge, which provides real-time processing and caching flexibility for service-based applications. Conventionally, the individual node solution is insufficient to tackle the increasing computation workload and provide diverse services, especially for unpredictable spatiotemporal service request patterns. To address this, we first propose a hierarchical collaborative computing (HCC) framework to serve users’ demands by reaping sufficient computing capability in Cloud, ubiquitous service area in edge layer, and idle resources in device layer. To better unleash the benefits of HCC and pursue long-term performance, we investigate heterogeneity-aware resource management by collaborative service placement, task scheduling, and resource allocation both in-node and cross-node. We then propose an online optimization framework that first decouples the decisions across different slots. For each instant mixed integer non-linear programming problem, we introduce the surrogate Lagrangian relaxation method to reduce complexity and design hybrid numerical techniques to solve the subproblems. 
Theoretical analysis and extensive simulation results demonstrate the efficiency of the HCC framework in decreasing system cost on devices, and our proposed algorithms can effectively utilize the resources in the collaborative space to achieve the trade-off between system cost minimization and service placement cost stability.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"983-997"},"PeriodicalIF":5.5,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143056298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Cost-Optimal Policies for DAGs to Utilize IaaS Clouds with Online Learning","authors":"Xiaohu Wu;Han Yu;Giuliano Casale;Guanyu Gao","doi":"10.1109/tsc.2025.3536305","DOIUrl":"https://doi.org/10.1109/tsc.2025.3536305","url":null,"abstract":"","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"20 1","pages":""},"PeriodicalIF":8.1,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143056299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TrustPay: A Dual-Layer Blockchain-Based Framework for Trusted Service Transaction","authors":"Shengye Pang;Xinkui Zhao;Shuyi Yu;Jintao Chen;Shuiguang Deng;Jianwei Yin","doi":"10.1109/TSC.2025.3534619","DOIUrl":"10.1109/TSC.2025.3534619","url":null,"abstract":"Web service-oriented transactions have become an integral part of the Internet economy, with the mainstream transaction patterns relying primarily on cloud service markets. However, traditional service transaction methods have deficiencies in terms of both trust and scalability. Distrust between service provider (SP) and consumer (SC), particularly around online payments and data security, impedes the further growth of service transactions. Although blockchain-based transaction mechanisms have made notable progress in addressing trust issues, they still face performance bottlenecks. To tackle these challenges, this paper introduces TrustPay, a service transaction framework that leverages a dual-layer blockchain structure consisting of a parent chain and multiple subchains. The framework partitions the subchain network based on business, with each subchain dedicated to storing service invocation records generated within a specific business unit. The smart contract deployed on the parent chain will settle the invocation records in all subchain networks as transaction records and facilitate automatic transfers among blockchain accounts. This design leverages blockchain's inherent reliability while improving its scalability in large-scale scenarios. Additionally, a novel consensus protocol, REFEREE, is introduced and applied to the subchain network, ensuring efficient recording of invocation data and trusted verification among participants, further enhancing both trust and performance. 
Comparative experiments and analysis show that TrustPay's dual-layer blockchain structure and REFEREE protocol are not only reliable but also outperform baseline methods in terms of efficiency.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"1068-1080"},"PeriodicalIF":5.5,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143056367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lightweight and Privacy-Preserving Reconfigurable Authentication Scheme for IoT Devices","authors":"Prosanta Gope;Fei Hongming;Biplab Sikdar","doi":"10.1109/TSC.2025.3536314","DOIUrl":"10.1109/TSC.2025.3536314","url":null,"abstract":"The Internet of Things (IoT) has revolutionized connectivity by enabling a large number of devices to autonomously exchange real-time data over the Internet. However, IoT devices used in public spaces are vulnerable to physical and cloning attacks. To address this issue, researchers have introduced the concept of physical-unclonable functions (PUFs) to enhance security in IoT applications. While PUF-based security solutions typically rely on static challenge-response behavior, many practical applications require dynamic or reconfigurable PUFs. For instance, PUF-based key storage may require updating or revoking secrets, and protection against modeling attacks, where an attacker can derive a PUF model from a set of challenge-response pairs (CRPs) using learning capabilities. In this paper, we introduce LR-OPUF, a reconfigurable one-time PUF, and propose a lightweight and privacy-preserving authentication scheme based on this LR-OPUF foundation. One notable feature of our authentication scheme is that it enables a device to prove its legitimacy to a semi-honest verifier without disclosing the CRPs. 
Through security and performance analyses, we demonstrate that our approach not only ensures vital security aspects but also exhibits high computational efficiency.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"912-925"},"PeriodicalIF":5.5,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143056297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-Scale Service Mesh Orchestration With Probabilistic Routing in Cloud Data Centers","authors":"Kai Peng;Yi Hu;Haonan Ding;Haoxuan Chen;Liangyuan Wang;Chao Cai;Menglan Hu","doi":"10.1109/TSC.2025.3526373","DOIUrl":"10.1109/TSC.2025.3526373","url":null,"abstract":"Service mesh architectures are emerging as a promising microservice paradigm for developing online cloud applications. However, in large-scale microservice scenarios, frequent service communications, intricate call dependencies, and stringent latency requirements bring great pressure to efficient service mesh orchestration. In this case, the problems of service deployment and request routing based on service mesh architectures are tightly-coupled and interdependent, and cannot be effectively optimized individually, enlarging the difficulty for collaborative orchestration. When microservice multiplexing, parallel dependencies, and multi-instance modeling are considered, the difficulty is further aggravated. Nonetheless, most existing work failed to propose appropriate models and methods for the above challenges. Therefore, this article studies the large-scale service mesh orchestration with probabilistic routing and constrained bandwidths for parallel call graphs. We leverage the open Jackson queuing network theory to capture crucial microservices and analyze request processing, queuing, and communication latency for massive user requests in a fine-grained way. Then, this article proposes an efficient three-stage heuristic, which achieves elegant multi-instance consolidation and probabilistic multi-queue routing to reduce response latency and cost. We also provide the algorithm complexity and mathematical analysis of the performance. 
Finally, extensive trace-driven experiments are performed to validate the superiority of our proposed algorithm over other baselines.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"868-882"},"PeriodicalIF":5.5,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142991505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}