Dynamic Hardware Defense for High-Centrality Nodes in Graph Convolutional Networks
Authors: Chang Cai; Peiyu Li; Wangshen Wen; Zeqi Huang; Youming Peng; Minchi Hu; Zehao Wu; Lei Shen; Jing Zhang
Journal: IEEE Transactions on Nuclear Science, vol. 72, no. 9, pp. 3052-3063
Publication date: 2025-08-04
DOI: 10.1109/TNS.2025.3595388
URL: https://ieeexplore.ieee.org/document/11109057/
Impact factor: 1.9 (JCR Q3, Engineering, Electrical & Electronic)
Citations: 0
Abstract
With the growing reliance on graph data processing in safety-critical applications, ensuring the reliability of graph convolutional network (GCN) hardware systems has become paramount, especially in radiation-prone environments where single-event upsets (SEUs) pose significant risks. This article presents a comprehensive design framework for a fault-tolerant GCN system, addressing the unique challenges of SEU susceptibility through a novel fault-aware centrality measure and a dynamic hardware defense (DHD) strategy. Our approach begins with the development of a fault-aware centrality measure to precisely model the distribution of critical nodes within graph data. Leveraging this measure, we design a DHD strategy that integrates partial circuit reinforcement with high-centrality node marking and dynamically routes the dataflow of the most influential nodes in the graph to reinforced circuit units. The proposed system architecture is validated through extensive experiments, which demonstrate significant improvements in fault tolerance. Specifically, the DHD strategy achieves improvements in hardening efficiency of $2.53\times$, $2.70\times$, and $2.85\times$, respectively, compared with the traditional full triple modular redundancy (FTMR) strategy. Neutron radiation testing further validates the robustness of the system, showing effective mitigation of fault propagation under extreme conditions. By maintaining minimal hardware overhead and offering dynamic, cost-effective protection, this design framework provides a reliable solution for deploying GCNs in safety-critical applications.
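The routing idea in the abstract can be illustrated with a small software sketch. This is not the authors' design. The abstract does not define the fault-aware centrality measure, so plain degree centrality stands in for it here. The function names, the marking fraction, and the XOR-based upset model are all hypothetical. Marked nodes are sent through a triplicated path with a bitwise majority vote, and all other nodes use an unprotected path.

```python
def degree_centrality(edges, num_nodes):
    """Stand-in score (assumption): degree divided by (num_nodes - 1)."""
    deg = [0] * num_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [d / (num_nodes - 1) for d in deg]


def mark_high_centrality(scores, fraction):
    """Return the ids of the top `fraction` of nodes by score (at least one)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)
    return set(ranked[:k])


def tmr_vote(a, b, c):
    """Bitwise majority vote over three redundant integer results."""
    return (a & b) | (a & c) | (b & c)


def aggregate(node, features, neighbors, flip_mask=0):
    """Sum the integer features of a node and its neighbors.

    `flip_mask` is XORed into the result to emulate a single-event upset
    (hypothetical fault model for illustration only).
    """
    total = features[node] + sum(features[n] for n in neighbors[node])
    return total ^ flip_mask


def route(node, marked, features, neighbors, seu_mask=0):
    """Send marked nodes to the triplicated path, others to the plain path."""
    if node in marked:
        # The upset is assumed to hit only one of the three copies.
        return tmr_vote(
            aggregate(node, features, neighbors, seu_mask),
            aggregate(node, features, neighbors),
            aggregate(node, features, neighbors),
        )
    return aggregate(node, features, neighbors, seu_mask)


# Star graph: node 0 is the hub and is the only node marked at fraction 0.2.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
features = [1, 2, 3, 4, 5]
neighbors = {n: [] for n in range(5)}
for u, v in edges:
    neighbors[u].append(v)
    neighbors[v].append(u)

marked = mark_high_centrality(degree_centrality(edges, 5), fraction=0.2)
hub_result = route(0, marked, features, neighbors, seu_mask=0b100)
leaf_result = route(1, marked, features, neighbors, seu_mask=0b100)
```

In this toy setup the injected bit flip is voted out for the hub, while the same flip corrupts the result of an unmarked leaf node. That is the trade-off selective hardening accepts in exchange for lower overhead than triplicating every unit.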
About the journal:
The IEEE Transactions on Nuclear Science is a publication of the IEEE Nuclear and Plasma Sciences Society. It is viewed as the primary source of technical information in many of the areas it covers. As judged by JCR impact factor, TNS consistently ranks in the top five journals in the category of Nuclear Science & Technology. It has one of the higher immediacy indices, indicating that the information it publishes is viewed as timely, and has a relatively long citation half-life, indicating that the published information also is viewed as valuable for a number of years.
The IEEE Transactions on Nuclear Science is published bimonthly. Its scope includes all aspects of the theory and application of nuclear science and engineering. It focuses on instrumentation for the detection and measurement of ionizing radiation; particle accelerators and their controls; nuclear medicine and its application; effects of radiation on materials, components, and systems; reactor instrumentation and controls; and measurement of radiation in space.