{"title":"Continual Test-Time Adaptation With Weighted Contrastive Learning and Pseudo-Label Correction","authors":"Shih-Chieh Chuang;Ching-Hu Lu","doi":"10.1109/TETC.2025.3528985","DOIUrl":"https://doi.org/10.1109/TETC.2025.3528985","url":null,"abstract":"Real-time adaptability is often required to maintain system accuracy in scenarios involving domain shifts caused by constantly changing environments. While continual test-time adaptation has been proposed to handle such scenarios, existing methods rely on high-accuracy pseudo-labels. Moreover, contrastive learning methods for continuous test-time adaptation consider the aggregation of features from the same class while neglecting the problem of aggregating similar features within the same class. Therefore, we propose “Weighted Contrastive Learning” and apply it to both pre-training and continual test-time adaptation. To address the issue of catastrophic forgetting caused by continual adaptation, previous studies have employed source-domain knowledge to stochastically recover the target-domain model. However, significant domain shifts may cause the source-domain knowledge to behave as noise, thus impacting the model's adaptability. Therefore, we propose “Domain-aware Pseudo-label Correction” to mitigate catastrophic forgetting and error accumulation without accessing the original source-domain data while minimizing the impact on model adaptability. 
The thorough evaluations in our experiments have demonstrated the effectiveness of our proposed approach.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"866-877"},"PeriodicalIF":5.4,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Pervasive Edge Computing Model for Proactive Intelligent Data Migration","authors":"Georgios Boulougaris;Kostas Kolomvatsos","doi":"10.1109/TETC.2025.3528994","DOIUrl":"https://doi.org/10.1109/TETC.2025.3528994","url":null,"abstract":"Currently, the research community is paying great attention to the intelligent, context-aware management of data at the intersection of the Internet of Things (IoT) and Edge Computing (EC). In this article, we propose a strategy for autonomous edge nodes to decide what data should be migrated to specific locations of the infrastructure to support the desired processing requests. Our intention is to arm nodes with the ability to learn the access patterns of offloaded data-driven tasks and predict which data should be migrated to the original ‘owners’ of tasks. Naturally, these tasks are linked to the processing of data that are absent at the original hosting nodes, indicating the required data assets that need to be accessed directly. To identify these data intervals, we employ an ensemble scheme that combines a statistically oriented model and a machine learning scheme. Hence, we are able not only to detect the density of the requests but also to learn and infer the ‘strong’ data assets. 
The proposed approach is analyzed in detail through the corresponding formulations and is evaluated and compared against baselines and models found in the respective literature.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"878-889"},"PeriodicalIF":5.4,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145057421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software-Defined Number Formats for High-Speed Belief Propagation","authors":"Amir Sabbagh Molahosseini;JunKyu Lee;Hans Vandierendonck","doi":"10.1109/TETC.2025.3528972","DOIUrl":"https://doi.org/10.1109/TETC.2025.3528972","url":null,"abstract":"This article presents the design and implementation of Software-Defined Floating-Point (SDF) number formats for high-speed implementation of the Belief Propagation (BP) algorithm. SDF formats are designed specifically to meet the numeric needs of the computation and are more compact representations of the data. They reduce memory footprint and memory bandwidth requirements without sacrificing accuracy, given that BP for loopy graphs inherently involves algorithmic errors. This article designs several SDF formats for sum-product BP applications by careful analysis of the computation. Our theoretical analysis leads to the design of 16-bit (half-precision) and 8-bit (mini-precision) widths. We moreover present highly efficient software implementation of the proposed SDF formats which is centered around conversion to hardware-supported single-precision arithmetic hardware. Our solution demonstrates negligible conversion overhead on commercially available CPUs. For Ising grids with sizes from 100 × 100 to 500 × 500, the 16- and 8-bit SDF formats along with our conversion module produce equivalent accuracy to double-precision floating-point format but with 2.86× speedups on average on an Intel Xeon processor. Particularly, increasing the grid size results in higher speed-up. 
For example, the proposed half-precision format with 3-bit exponent and 13-bit mantissa achieved the minimum and maximum speedups of 1.30× and 1.39× over single-precision, and 2.55× and 3.40× over double-precision, by increasing grid size from 100 × 100 to 500 × 500.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"853-865"},"PeriodicalIF":5.4,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145051083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scatter-Gather DMA Performance Analysis Within an SoC-Based Control System for Trapped-Ion Quantum Computing","authors":"Tiamike Dudley;Jim Plusquellic;Eirini Eleni Tsiropoulou;Joshua Goldberg;Daniel Stick;Daniel Lobser","doi":"10.1109/TETC.2025.3528899","DOIUrl":"https://doi.org/10.1109/TETC.2025.3528899","url":null,"abstract":"Scatter-gather dynamic-memory-access (SG-DMA) is utilized in applications that require high bandwidth and low latency data transfers between memory and peripherals, where data blocks, described using buffer descriptors (BDs), are distributed throughout the memory system. The data transfer organization and requirements of a Trapped-Ion Quantum Computer (TIQC) possess characteristics similar to those targeted by SG-DMA. In particular, the ion qubits in a TIQC are manipulated by applying control sequences consisting primarily of modulated laser pulses. These optical pulses are defined by parameters that are (re)configured by the electrical control system. Variations in the operating environment and equipment make it necessary to create and run a wide range of control sequence permutations, which can be well represented as BD regions distributed across the main memory. 
In this article, we experimentally evaluate the latency and throughput of SG-DMA on Xilinx radiofrequency SoC (RFSoC) devices under a variety of BD and payload sizes as a means of determining the benefits and limitations of an RFSoC system architecture for TIQC applications.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"841-852"},"PeriodicalIF":5.4,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145050793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Deep Neural Network Reliability via Transient-Fault-Aware Design and Training","authors":"Fernando Fernandes dos Santos;Niccolò Cavagnero;Marco Ciccone;Giuseppe Averta;Angeliki Kritikakou;Olivier Sentieys;Paolo Rech;Tatiana Tommasi","doi":"10.1109/TETC.2024.3520672","DOIUrl":"https://doi.org/10.1109/TETC.2024.3520672","url":null,"abstract":"Deep Neural Networks (DNNs) have revolutionized several fields, including safety- and mission-critical applications, such as autonomous driving and space exploration. However, recent studies have highlighted that transient hardware faults can corrupt the model's output, leading to high misprediction probabilities. Since traditional reliability strategies, based on modular hardware, software replications, or matrix multiplication checksum impose a high overhead, there is a pressing need for efficient and effective hardening solutions tailored for DNNs. In this article we present several network design choices and a training procedure that increase the robustness of standard deep models and thoroughly evaluate these strategies with experimental analyses on vision classification tasks. We name DieHardNet the specialized DNN obtained by applying all our hardening techniques that combine knowledge from experimental hardware faults characterization and machine learning studies. We conduct extensive ablation studies to quantify the reliability gain of each hardening component in DieHardNet. We perform over 10,000 instruction-level fault injections to validate our approach and expose DieHardNet executed on GPUs to an accelerated neutron beam equivalent to more than 570,000 years of natural radiation. 
Our evaluation demonstrates that DieHardNet can reduce the critical error rate (i.e., errors that modify the inference) up to 100 times compared to the unprotected baseline model, without causing any increase in inference time.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"829-840"},"PeriodicalIF":5.4,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145050993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy Efficient Approximate Computing Framework for DNN Acceleration Using a Probabilistic-Oriented Method","authors":"Pengfei Huang;Ke Chen;Chenghua Wang;Weiqiang Liu","doi":"10.1109/TETC.2024.3522307","DOIUrl":"https://doi.org/10.1109/TETC.2024.3522307","url":null,"abstract":"Approximate computing (AxC) has recently emerged as a successful approach for optimizing energy consumption in error-tolerant applications, such as deep neural networks (DNNs). The enormous model size and high computation cost of DNNs present significant challenges for deployment in energy-efficient and resource-constrained computing systems. Emerging DNN hardware accelerators based on AxC designs selectively approximate the non-critical segments of computation to address these challenges. However, a systematic and principled approach that incorporates domain knowledge and approximate hardware for optimal approximation is still lacking. In this paper, we propose a probabilistic-oriented AxC (PAxC) framework that provides high energy savings with acceptable quality by considering the overall probability effect of approximation. To achieve aggressive approximate designs, we utilize the minimum likelihood error to determine the AxC synergy profile at both application and circuit levels. This enables effective coordination of the trade-off between energy and accuracy. Compared with a baseline design, the power-delay product (PDP) is significantly reduced by up to 83.66% with an acceptable accuracy reduction. 
Simulation and a case study of the image process validate the effectiveness of the proposed framework.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"816-828"},"PeriodicalIF":5.4,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145050820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D Invisible Cloak: A Robust Person Stealth Attack Against Object Detector in Complex 3D Physical Scenarios","authors":"Mingfu Xue;Can He;Yushu Zhang;Zhe Liu;Weiqiang Liu","doi":"10.1109/TETC.2024.3513392","DOIUrl":"https://doi.org/10.1109/TETC.2024.3513392","url":null,"abstract":"In this article, we propose a novel physical stealth attack against person detectors in the real world. For the first time, we consider the impact of complex and challenging 3D physical constraints (e.g., radian, wrinkle, occlusion, and angle) on person stealth attacks, and propose 3D transformations to generate a robust 3D invisible cloak. We launch the person stealth attacks in 3D physical space instead of on a 2D plane by printing the adversarial patches on real clothes. Anyone wearing the cloak can evade person detectors and achieve stealth under challenging and complex 3D physical scenarios. Experimental results in various indoor and outdoor physical scenarios show that the proposed person stealth attack method is robust and effective even under complex and challenging physical conditions, such as when the cloak is wrinkled, obscured, curved, or viewed from different or large angles. 
The attack success rate of the generated adversarial patch in digital domain (Inria dataset) is 86.56% against YOLO v2 and 80.32% against YOLO v5, while the static and dynamic stealth attack success rates of the generated 3D invisible cloak in physical world are 100%, 77% against YOLO v2 and 100%, 83.95% against YOLO v5, respectively, which are significantly better than state-of-the-art works.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"799-815"},"PeriodicalIF":5.4,"publicationDate":"2024-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145050821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spiker+: A Framework for the Generation of Efficient Spiking Neural Networks FPGA Accelerators for Inference at the Edge","authors":"Alessio Carpegna;Alessandro Savino;Stefano Di Carlo","doi":"10.1109/TETC.2024.3511676","DOIUrl":"https://doi.org/10.1109/TETC.2024.3511676","url":null,"abstract":"Including Artificial Neural Networks in embedded systems at the edge allows applications to exploit Artificial Intelligence capabilities directly within devices operating at the network periphery, containing sensitive data within the boundaries of the edge device. This facilitates real-time decision-making, reduces latency and power consumption, and enhances privacy and security. Spiking Neural Networks (SNNs) offer a promising computing paradigm in these environments. However, deploying efficient SNNs in resource-constrained edge devices requires highly parallel and reconfigurable hardware implementations. We introduce Spiker+, a comprehensive framework for generating efficient, low-power, and low-area SNN accelerators on Field Programmable Gate Arrays for inference at the edge. Spiker+ presents a configurable multi-layer SNN hardware architecture, a library of highly efficient neuron architectures, and a design framework to enable easy, Python-based customization of accelerators. Spiker+ is tested on three benchmark datasets: MNIST, Spiking Heidelberg Dataset (SHD), and AudioMNIST. On MNIST, it outperforms state-of-the-art SNN accelerators in terms of resource allocation, with a requirement of 7,612 logic cells and 18 Block RAMs (BRAMs), and power consumption, draining only 180 mW, with comparable latency (780 μs/img) and accuracy (97%). On SHD and AudioMNIST, Spiker+ requires 18,268 and 10,124 logic cells, respectively, requiring 51 and 16 BRAMs, consuming 430 mW and 290 mW, with an accuracy of 75% and 95%. 
These results underscore the significance of Spiker+ in the hardware-accelerated SNN landscape, making it an excellent solution for deploying configurable and tunable SNN architectures in resource and power-constrained edge applications.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"784-798"},"PeriodicalIF":5.4,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10794606","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145051041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guest Editorial: Special Section on “Approximate Data Processing: Computing, Storage and Applications”","authors":"Ke Chen;Shanshan Liu;Weiqiang Liu;Fabrizio Lombardi;Nader Bagherzadeh","doi":"10.1109/TETC.2024.3488452","DOIUrl":"https://doi.org/10.1109/TETC.2024.3488452","url":null,"abstract":"","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 4","pages":"954-955"},"PeriodicalIF":5.1,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10779333","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142777561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Emerging Topics in Computing Information for Authors","authors":"","doi":"10.1109/TETC.2024.3499715","DOIUrl":"https://doi.org/10.1109/TETC.2024.3499715","url":null,"abstract":"","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 4","pages":"C2-C2"},"PeriodicalIF":5.1,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10779345","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142777661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}