{"title":"Performance and Environment-Aware Advanced Driving Assistance Systems","authors":"Sreenitha Kasarapu;Sai Manoj Pudukotai Dinakarrao","doi":"10.1109/TC.2024.3475572","DOIUrl":"https://doi.org/10.1109/TC.2024.3475572","url":null,"abstract":"In autonomous and self-driving vehicles, visual perception of the driving environment plays a key role. Vehicles rely on machine learning (ML) techniques such as deep neural networks (DNNs), which are extensively trained on manually annotated databases to achieve this goal. However, the availability of training data that can represent different environmental conditions can be limited. Furthermore, as different driving terrains require different decisions by the driver, it is tedious and impractical to design a database with all possible scenarios. This work proposes a semi-parametric approach that bypasses the manual annotation required to train vehicle perception systems in autonomous and self-driving vehicles. We present a novel “Performance and Environment-aware Advanced Driving Assistance Systems” which employs one-shot learning for efficient data generation using user action and response in addition to the synthetic traffic data generated as Pareto optimal solutions from one-shot objects using a set of generalization functions. Adapting to the driving environments through such optimization adds more robustness and safety features to autonomous driving. We evaluate the proposed framework on environment perception challenges encountered in autonomous driving assistance systems. To accelerate the learning and adapt in real-time to perceived data, a novel deep learning-based Alternating Direction Method of Multipliers (dlADMM) algorithm is introduced to improve the convergence capabilities of regular machine learning models. This methodology optimizes the training process and makes applying the machine learning model to real-world problems more feasible. We evaluated the proposed technique on AlexNet and MobileNetv2 networks and achieved more than 18\u0000<inline-formula><tex-math>$times$</tex-math></inline-formula>\u0000 speedup. By making the proposed technique behavior-aware we observed performance of upto 99% while detecting traffic signals.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"131-142"},"PeriodicalIF":3.6,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sketch-Based Adaptive Communication Optimization in Federated Learning","authors":"Pan Zhang;Lei Xu;Lin Mei;Chungen Xu","doi":"10.1109/TC.2024.3475578","DOIUrl":"https://doi.org/10.1109/TC.2024.3475578","url":null,"abstract":"In recent years, cross-device federated learning (FL), particularly in the context of Internet of Things (IoT) applications, has demonstrated its remarkable potential. Despite significant efforts, empirical evidence suggests that FL algorithms have yet to gain widespread practical adoption. The primary obstacle stems from the inherent bandwidth overhead associated with gradient exchanges between clients and the server, resulting in substantial delays, especially within communication networks. To deal with the problem, various solutions are proposed with the hope of finding a better balance between efficiency and accuracy. Following this goal, we focus on investigating how to design a lightweight FL algorithm that requires less communication cost while maintaining comparable accuracy. Specifically, we propose a Sketch-based FL algorithm that combines the incremental singular value decomposition (ISVD) method in a way that does not negatively affect accuracy much in the training process. Moreover, we also provide adaptive gradient error accumulation and error compensation mechanisms to mitigate accumulated gradient errors caused by sketch compression and improve the model accuracy. Our extensive experimentation with various datasets demonstrates the efficacy of our proposed approach. Specifically, our scheme achieves nearly a 93% reduction in communication cost during the training of multi-layer perceptron models (MLP) using the MNIST dataset.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"170-184"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs","authors":"Aodong Chen;Fei Xu;Li Han;Yuan Dong;Li Chen;Zhi Zhou;Fangming Liu","doi":"10.1109/TC.2024.3475589","DOIUrl":"https://doi.org/10.1109/TC.2024.3475589","url":null,"abstract":"GPUs have become the \u0000<i>defacto</i>\u0000 hardware devices for accelerating Deep Neural Network (DNN) inference workloads. However, the conventional \u0000<i>sequential execution mode of DNN operators</i>\u0000 in mainstream deep learning frameworks cannot fully utilize GPU resources, even with the operator fusion enabled, due to the increasing complexity of model structures and a greater diversity of operators. Moreover, the \u0000<i>inadequate operator launch order</i>\u0000 in parallelized execution scenarios can lead to GPU resource wastage and unexpected performance interference among operators. In this paper, we propose \u0000<i>Opara</i>\u0000, a resource- and interference-aware DNN \u0000<u>Op</u>\u0000erator \u0000<u>para</u>\u0000llel scheduling framework to accelerate DNN inference on GPUs. Specifically, \u0000<i>Opara</i>\u0000 first employs \u0000<monospace>CUDA Streams</monospace>\u0000 and \u0000<monospace>CUDA Graph</monospace>\u0000 to \u0000<i>parallelize</i>\u0000 the execution of multiple operators automatically. To further expedite DNN inference, \u0000<i>Opara</i>\u0000 leverages the resource demands of operators to judiciously adjust the operator launch order on GPUs, overlapping the execution of compute-intensive and memory-intensive operators. We implement and open source a prototype of \u0000<i>Opara</i>\u0000 based on PyTorch in a \u0000<i>non-intrusive</i>\u0000 manner. Extensive prototype experiments with representative DNN and Transformer-based models demonstrate that \u0000<i>Opara</i>\u0000 outperforms the default sequential \u0000<monospace>CUDA Graph</monospace>\u0000 in PyTorch and the state-of-the-art operator parallelism systems by up to \u0000<inline-formula><tex-math>$1.68boldsymbol{times}$</tex-math></inline-formula>\u0000 and \u0000<inline-formula><tex-math>$1.29boldsymbol{times}$</tex-math></inline-formula>\u0000, respectively, yet with acceptable runtime overhead.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"325-333"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Numerical Variability Approach to Results Stability Tests and Its Application to Neuroimaging","authors":"Yohan Chatelain;Loïc Tetrel;Christopher J. Markiewicz;Mathias Goncalves;Gregory Kiar;Oscar Esteban;Pierre Bellec;Tristan Glatard","doi":"10.1109/TC.2024.3475586","DOIUrl":"https://doi.org/10.1109/TC.2024.3475586","url":null,"abstract":"Ensuring the long-term reproducibility of data analyses requires results stability tests to verify that analysis results remain within acceptable variation bounds despite inevitable software updates and hardware evolutions. This paper introduces a numerical variability approach for results stability tests, which determines acceptable variation bounds using random rounding of floating-point calculations. By applying the resulting stability test to \u0000<italic>fMRIPrep</i>\u0000, a widely-used neuroimaging tool, we show that the test is sensitive enough to detect subtle updates in image processing methods while remaining specific enough to accept numerical variations within a reference version of the application. This result contributes to enhancing the reliability and reproducibility of data analyses by providing a robust and flexible method for stability testing.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"200-209"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and Fast High-Performance Library Generation for Deep Learning Accelerators","authors":"Jun Bi;Yuanbo Wen;Xiaqing Li;Yongwei Zhao;Yuxuan Guo;Enshuai Zhou;Xing Hu;Zidong Du;Ling Li;Huaping Chen;Tianshi Chen;Qi Guo","doi":"10.1109/TC.2024.3475575","DOIUrl":"https://doi.org/10.1109/TC.2024.3475575","url":null,"abstract":"The widespread adoption of deep learning accelerators (DLAs) underscores their pivotal role in improving the performance and energy efficiency of neural networks. To fully leverage the capabilities of these accelerators, exploration-based library generation approaches have been widely used to substantially reduce software development overhead. However, these approaches have been challenged by issues related to sub-optimal optimization results and excessive optimization overheads. In this paper, we propose \u0000<small>Heron</small>\u0000 to generate high-performance libraries of DLAs in an efficient and fast way. The key is automatically enforcing massive constraints through the entire program generation process and guiding the exploration with an accurate pre-trained cost model. \u0000<small>Heron</small>\u0000 represents the search space as a constrained satisfaction problem (CSP) and explores the space via evolving the CSPs. Thus, the sophisticated constraints of the search space are strictly preserved during the entire exploration process. The exploration algorithm has the flexibility to engage in space exploration using either online-trained models or pre-trained models. Experimental results demonstrate that \u0000<small>Heron</small>\u0000 averagely achieves 2.71\u0000<inline-formula><tex-math>$times$</tex-math></inline-formula>\u0000 speedup over three state-of-the-art automatic generation approaches. Also, compared to vendor-provided hand-tuned libraries, \u0000<small>Heron</small>\u0000 achieves a 2.00\u0000<inline-formula><tex-math>$times$</tex-math></inline-formula>\u0000 speedup on average. When employing a pre-trained model, \u0000<small>Heron</small>\u0000 achieves 11.6\u0000<inline-formula><tex-math>$times$</tex-math></inline-formula>\u0000 compilation time speedup, incurring a minor impact on execution time.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"155-169"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Service Function Chain Placement Over Heterogeneous Devices in Deviceless Edge Computing Environments","authors":"Yaodong Huang;Tingting Yao;Zelin Lin;Xiaojun Shang;Yukun Yuan;Laizhong Cui;Yuanyuan Yang","doi":"10.1109/TC.2024.3475590","DOIUrl":"https://doi.org/10.1109/TC.2024.3475590","url":null,"abstract":"Heterogeneous devices in edge computing bring challenges as well as opportunities for edge computing to utilize powerful and heterogeneous hardware for a variety of complex tasks. In this paper, we propose a service function chain placement strategy considering the heterogeneity of devices in deviceless edge computing environments. The service function chain system utilizes lightweight virtualization technologies to manage resources, considering the heterogeneity of devices to support various complex tasks, and offer low latency services to user requests. We propose an optimal service function chain placement problem minimizing the service delay and formulate it into a quasi-convex problem. We implement different edge applications that can be served by function chains and conduct extensive experiments over real heterogeneous edge devices. Results from the experiments and simulations show that our proposed service function chain scheme is applicable in edge environments, and perform well over services latency, resource utilization as well as the power consumption of edge devices.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"222-236"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking Control Flow in Spatial Architectures: Insights Into Control Flow Plane Design","authors":"Jinyi Deng;Xinru Tang;Jiahao Zhang;Yuxuan Li;Linyun Zhang;Fengbin Tu;Shaojun Wei;Yang Hu;Shouyi Yin","doi":"10.1109/TC.2024.3475582","DOIUrl":"https://doi.org/10.1109/TC.2024.3475582","url":null,"abstract":"Spatial architecture is a high-performance paradigm that employs control flow graphs and data flow graphs as computation model, and producer/consumer models as execution model. However, existing spatial architectures struggle with control flow handling challenges. Upon thoroughly characterizing their PE execution models, we observe that they lack autonomous, peer-to-peer, and temporally loosely-coupled control flow handling capability. This degrades its performance in intensive control programs. To tackle the existing control flow handling challenges, Marionette, a spatial architecture with an explicit-designed control flow plane, is proposed. We elaborately develop a full stack of Marionette architecture, from ISA, compiler, simulator to RTL. Marionette's flexible Control Flow Plane enables autonomous, peer-to-peer, and temporally loosely-coupled control flow management. Its Proactive PE Configuration ensures computation-overlapped and timely configuration to promote Branch Divergence handling capability. Besides, Marionette's Agile PE Assignment improves pipeline performance of imperfect loops. Compared to state-of-the-art spatial architectures, the experimental results demonstrate that Marionette outperforms Softbrain, TIA, REVEL, and RipTide by geomean 2.88\u0000<inline-formula><tex-math>$mathbf{times}$</tex-math></inline-formula>\u0000, 3.38\u0000<inline-formula><tex-math>$mathbf{times}$</tex-math></inline-formula>\u0000, 1.55\u0000<inline-formula><tex-math>$mathbf{times}$</tex-math></inline-formula>\u0000, and 2.66\u0000<inline-formula><tex-math>$mathbf{times}$</tex-math></inline-formula>\u0000 in a variety of challenging intensive control programs.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"185-199"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Modular Multiplication Using Variable Length Algorithms","authors":"Shahab Mirzaei-Teshnizi;Parviz Keshavarzi","doi":"10.1109/TC.2024.3475574","DOIUrl":"https://doi.org/10.1109/TC.2024.3475574","url":null,"abstract":"This paper presents two improved modular multiplication algorithms: variable length Interleaved modular multiplication (VLIM) algorithm and parallel modular multiplication (P_MM) method using variable length algorithms to achieve high throughput rates. The new Interleaved modular multiplication algorithm applies the zero counting and partitioning algorithm to a multiplier’s non-adjacent form (NAF). It divides this input into sections with variable-radix. The sections include a digit of zero sequences and a non-zero digit (-1 or 1) in the most valuable place. Therefore, in addition to reducing the number of required clock pulses, high-radix partial multiplication \u0000<inline-formula><tex-math>$mathbf{X}^{left(mathbf{i}right)}cdot mathbf{Y}$</tex-math></inline-formula>\u0000 is simplified and performed as a binary addition or subtraction operation, and multiplication operations for consecutive zero bits are executed in one clock cycle instead of several clock cycles. The proposed parallel modular multiplication algorithm divides the multiplier into two parts. It utilizes (VLIM) and variable length Montgomery modular multiplication (VLM3) methods to compute the modular multiplication for the upper and lower portions in parallel, according to the proximity of their multiplication time. The implementation results on a Xilinx Virtex-7 FPGA show that the parallel modular multiplication computes a 2048-bit modular multiplication in 0.903 µs, with a maximum clock frequency of 387 MHz and area × time per bit value equal to 9.14.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"143-154"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ClusPar: A Game-Theoretic Approach for Efficient and Scalable Streaming Edge Partitioning","authors":"Zezhong Ding;Deyu Kong;Zhuoxu Zhang;Xike Xie;Jianliang Xu","doi":"10.1109/TC.2024.3475568","DOIUrl":"https://doi.org/10.1109/TC.2024.3475568","url":null,"abstract":"Streaming edge partitioning plays a crucial role in the distributed processing of large-scale web graphs, such as pagerank. The quality of partitioning is of utmost importance and directly affects the runtime cost of distributed graph processing. However, streaming graph clustering, a key component of mainstream streaming edge partitioning, is vertex-centric. This incurs a mismatch with the edge-centric partitioning strategy, necessitating additional post-processing and several graph traversals to transition from vertex-centric clusters to edge-centric partitions. This transition not only adds extra runtime overhead but also risks a decline in partitioning quality. In this paper, we propose a novel algorithm, called ClusPar, to address the problem of streaming edge partitioning. The ClusPar framework consists of two steps, streaming edge clustering and edge cluster partitioning. Different from prior studies, the first step traverses the input graph in a single pass to generate edge-centric clusters, while the second step applies game theory over these edge-centric clusters to produce partitions. Extensive experiments show that ClusPar outperforms the state-of-the-art streaming edge partitioning methods in terms of the partitioning quality, efficiency, and scalability.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"116-130"},"PeriodicalIF":3.6,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Federated Learning Based DDoS Attacks Detection in Large Scale Software-Defined Network","authors":"Yannis Steve Nsuloun Fotse;Vianney Kengne Tchendji;Mthulisi Velempini","doi":"10.1109/TC.2024.3474180","DOIUrl":"https://doi.org/10.1109/TC.2024.3474180","url":null,"abstract":"Software-Defined Networking (SDN) is an innovative concept that segments the network into three planes: a control plane comprising of one or multiple controllers; a data plane responsible for data transmission; and an application plane which enables the reconfiguration of network functionalities. Nevertheless, this approach has exposed the controller as a prime target for malicious elements to attack it, such as Distributed Denial of Service (DDoS) attacks. Current DDoS defense schemes often increased the controller load and resource consumption. These schemes are typically tailored for single-controller architectures, a significant limitation when considering the scalability requirements of large-scale SDN. To address these limitations, we introduce an efficient Federated Learning approach, named “FedLAD,” designed to counter DDoS attacks in SDN-based large-scale networks, particularly in multi-controller architectures. Federated learning is a decentralized approach to machine learning where models are trained across multiple devices as controllers store local data samples, without exchanging them. The evaluation of the proposed scheme's performance, using InSDN, CICDDoS2019, and CICDoS2017 datasets, shows an accuracy exceeding 98%, a significant improvement compared to related works. Furthermore, the evaluation of the FedLAD protocol with real-time traffic in an SDN context demonstrates its ability to detect DDoS attacks with high accuracy and minimal resource consumption. To the best of our knowledge, this work introduces a new technique in applying FL for DDoS attack detection in large-scale SDN.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 1","pages":"101-115"},"PeriodicalIF":3.6,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10705345","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}