{"title":"Area Efficient Skyrmion Logic Based Approximate Adder Architecture Design Methodology","authors":"Santhosh Sivasubramani;Bibekananda Paikaray;Mahathi Kuchibhotla;Arabinda Haldar;Chandrasekhar Murapaka;Amit Acharyya","doi":"10.1109/TETC.2024.3434723","DOIUrl":"10.1109/TETC.2024.3434723","url":null,"abstract":"In this study, the first of its kind skyrmion logic based area efficient approximate nanomagnetic (APN) adder architecture design methodology is introduced along with its implementation using theoretical modelling and micromagnetic simulations. We propose here for the first time, skyrmion based APN adder architecture design using only one majority gate reconfigured runtime (RR) using single layout. This low complex device structure is modelled using three inputs with the bilayer ferromagnet/heavy metal utilizing the exploitation of output reversal mechanism using magnetic tunnel junctions (MTJs) for read and write of skyrmions. The implementation is performed using this same device where current is passed through a metallic gate for control mechanism to achieve various logic functionalities. We also introduce here the Boolean optimization followed by mapping logic for the demonstration of skyrmion RRAPN adder alongside the majority logic gate. This proposed RRAPN adder architecture design possesses low complexity in terms of utilization of resources aiding towards the reduction of number of majority logic gates (<inline-formula><tex-math>$\\sim$</tex-math></inline-formula><inline-formula><tex-math>$60\\%$</tex-math></inline-formula> device footprint reduction) and is evaluated against standard error metrics. 
RRAPN adder architecture design proposed has its advantages with miniaturisation aided by enhanced lithographic process nodes, creating a new potential for nanomagnetic logic devices.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 2","pages":"525-536"},"PeriodicalIF":5.1,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CAT SNN: Conversion Aware Training for High Accuracy and Hardware Friendly Spiking Neural Networks","authors":"Dongwoo Lew;Jongsun Park","doi":"10.1109/TETC.2024.3435135","DOIUrl":"10.1109/TETC.2024.3435135","url":null,"abstract":"Among the various training algorithms for spiking neural network (SNN), ANN-to-SNN conversion gained popularity due to high accuracy and scalability to deep networks. By converting artificial neural network (ANN) to SNN and employing conversion loss reduction techniques, previous ANN-to-SNN conversion approaches achieved good accuracies. However, previous works do not consider the overheads to implement conversion loss reductions in hardware, thereby limiting its feasibility of hardware implementation. In this paper, we present conversion aware training (CAT), where SNN is simulated as closely as possible during ANN training for obtaining SNN-like ANN. So, our approach does not need any conversion loss reduction techniques after conversion, thus reducing hardware overhead while achieving state-of-the-art accuracies for SNNs using various neural coding methods. In addition, as an application of CAT for obtaining a hardware friendly SNN, we demonstrate a lightweight time-to-first-spike (TTFS) coding that adopts logarithmic computations enabled by CAT. An SNN processor that supports the logarithmic TTFS is implemented in 28nm CMOS process, achieving 91.7/67.9/57.4% accuracy and 486.7/503.6/1426uJ inference energy on CIFAR-10/100/Tiny-ImageNet, when running 5-bit logarithmic weight VGG-16. 
The key contributions are 1) proposing CAT as an ANN-to-SNN conversion guideline 2) applying CAT on various neural codings 3) presenting co-designed TTFS coding and processor.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 2","pages":"512-524"},"PeriodicalIF":5.1,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Acceleration of the Bootstrapping in TFHE by FPGA","authors":"Jian Zhang;Aijiao Cui;Yier Jin","doi":"10.1109/TETC.2024.3433473","DOIUrl":"10.1109/TETC.2024.3433473","url":null,"abstract":"Privacy-preserving computing is playing an ever-increasingly important role in various fields. A leading example of privacy-preserving computing is Fully Homomorphic Encryption (FHE). FHE enables arbitrary computations directly on the ciphertext. This guarantees that the original data will not be disclosed while processing the data. However, FHE brings in the high computation cost which, in turn, limits the application of FHE. Among all steps of FHE, bootstrapping is a critical operation yet a bottleneck for the FHE efficiency. Torus FHE (TFHE) was presented as a method which can compute arbitrary Boolean functions on ciphertext with fast gate bootstrapping. In this paper, we show an implementation of TFHE gate bootstrapping on ZYNQ ZCU102 FPGA board. The memory operation is specially organized to facilitate the implementation of the adopted Number Theoretic Transform (NTT) of external product. Each function involved in the TFHE gate bootstrapping is implemented at the register-transfer level (RTL), and each operation is carefully scheduled to maximize the parallelism. Experimental results show that with ZCU102 working at the frequency of 300MHz, the proposed scheme can bootstrap one bit within 1.9ms on average. Compared with the accelerated TFHE using the mainstream CPU, the proposed scheme shows a 5.0X speedup. If under the similar clock frequency, it presents 1.23X faster than cuFHE which is accelerated by GPU. 
The proposed scheme also shows other advantages such as high efficiency and better tradeoff than existing FPGA-based acceleration schemes.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 2","pages":"496-511"},"PeriodicalIF":5.1,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141868537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantum-Inspired Differential Evolution With Decoding Using Hashing for Efficient User Allocation in Edge Computing Environment","authors":"Marlom Bey;Pratyay Kuila;Banavath Balaji Naik","doi":"10.1109/TETC.2024.3433570","DOIUrl":"10.1109/TETC.2024.3433570","url":null,"abstract":"Modern apps require high computing resources for real-time data processing, allowing app users (AUs) to access real-time information. Edge computing (EC) provides dynamic computing resources to AUs for real-time data processing. However, due to resources and coverage constraints, edge servers (ESs) in specific areas can only serve a limited number of AUs. Hence, the app user allocation problem (AUAP) becomes challenging in the EC environment. This paper proposes a quantum-inspired differential evolution algorithm (QDE-UA) for efficient user allocation in the EC environment. The quantum vector is designed to provide a complete solution to the AUAP. The fitness function considers the minimum use of ES, user allocation rate (UAR), energy consumption, and load balance. Extensive simulations and hypotheses-based statistical analyses (ANOVA, Friedman test) are performed to show the significance of the proposed QDE-UA. The results indicate that QDE-UA outperforms the majority of the existing strategies with an average UAR improvement of 112.42%, and 140.62% enhancement in load balance while utilizing 13.98% fewer ESs. Due to the higher UAR, QDE-UA shows 59.28% higher total energy consumption on average. 
However, the lower energy consumption per AU is evidence of its energy efficiency.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 2","pages":"481-495"},"PeriodicalIF":5.1,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141868535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WASMBOX: A Lightweight Wasm-Based Runtime for Trustworthy Multi-Tenant Embedded Systems","authors":"Luigi Coppolino;Salvatore D'Antonio;Giovanni Mazzeo;Roberto Nardone;Luigi Romano;Mathieu Schmitt","doi":"10.1109/TETC.2024.3409817","DOIUrl":"10.1109/TETC.2024.3409817","url":null,"abstract":"Enabling multi-tenancy on edge devices is crucial for maximizing resource utilization, enhancing scalability, and reducing costs. However, it introduces the challenge of maintaining tenant isolation, preventing adverse inter-tenant effects and unauthorized resource access. Traditional multi-tenant solutions often struggle in embedded systems due to resource constraints, and current lightweight approaches suffer from performance, portability, and tenant density issues. We propose <monospace>WASMBOX</monospace>, a novel solution for sandboxing applications in multi-tenant embedded systems. It leverages WebAssembly to offer strong isolation, small attack surface, high portability, efficient resource usage, and near-native performance. Our system ensures both attack prevention and detection, using a patched WebAssembly System Interface for safe system call execution, and a monitoring layer for anomaly detection. Additionally, <monospace>WASMBOX</monospace> uses a Trusted Execution Environment for further isolating applications against escaping tenants and attesting to the integrity of WebAssembly applications. We validated our solution in a real-world case study with the <italic>SpaceApplications</italic> company, aiming to adopt a multi-tenant model for its ISS-based micro-gravity research facility. The experimental evaluation compared <monospace>WASMBOX</monospace> with approaches relying on VMs, containers, and microkernel-based VMs. 
The obtained results show that <monospace>WASMBOX</monospace> has the lowest resource usage, the highest tenant density, the second lowest startup (preceded by microkernels), and execution time (preceded by containers).","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 2","pages":"467-480"},"PeriodicalIF":5.1,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141934513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Special Section on Community Detection in Time-Varying Information and Computing Networks: Theory, Models, and Applications","authors":"Jia Wu;Jian Yang;Philip S. Yu;Carlo Condo","doi":"10.1109/TETC.2024.3395072","DOIUrl":"https://doi.org/10.1109/TETC.2024.3395072","url":null,"abstract":"<bold>Jia Wu</bold> received the PhD degree in computer science from the University of Technology Sydney, Ultimo, NSW, Australia. He is currently an ARC DECRA fellow with the Department of Computing, Macquarie University, Sydney, Australia. Prior to that, he was with the center for Artificial Intelligence, University of Technology Sydney. His current research interests include data mining and machine learning.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 2","pages":"402-402"},"PeriodicalIF":5.9,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10552378","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141308671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Special Section on Emerging Topics in Hardware Computing Systems Security","authors":"Gang Qu;Debdeep Mukhopadhyay;Nele Mentens;Weiqiang Liu","doi":"10.1109/TETC.2024.3394668","DOIUrl":"https://doi.org/10.1109/TETC.2024.3394668","url":null,"abstract":"<bold>Gang Qu</bold> received the BS degree in mathematics from the University of Science and Technology of China (USTC), China, and the PhD degree in computer science from the University of California, Los Angeles (UCLA), USA. He is currently a professor with the Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA, where he leads the Maryland Embedded Systems and Hardware Security Lab (MeshSec Lab) and the Wireless Sensor Laboratory. His research interests include hardware security and trust, artificial intelligence, security in vehicular systems, and the Internet of Things. He is also known for his work on wireless sensor networks, low power and energy efficient embedded system design.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 2","pages":"482-482"},"PeriodicalIF":5.9,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10552413","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141308685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guest Editorial Navigating the Nexus of Cyber Security and Resilience","authors":"Francesco Flammini;Cristina Alcaraz","doi":"10.1109/TETC.2024.3402450","DOIUrl":"https://doi.org/10.1109/TETC.2024.3402450","url":null,"abstract":"Welcome to this special issue of the IEEE Transactions on Emerging Topics in Computing, dedicated to exploring the dynamic landscape of Cyber Security and Resilience. In an era where digital advancements are driving unprecedented connectivity and innovation, the imperative for robust cyber defenses and resilient systems has never been more pressing.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 2","pages":"558-558"},"PeriodicalIF":5.9,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10552379","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141308672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Emerging Topics in Computing Information for Authors","authors":"","doi":"10.1109/TETC.2024.3402764","DOIUrl":"https://doi.org/10.1109/TETC.2024.3402764","url":null,"abstract":"","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 2","pages":"C2-C2"},"PeriodicalIF":5.9,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10552381","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141308654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DECC: Delay-Aware Edge-Cloud Collaboration for Accelerating DNN Inference","authors":"Zirui Zhuang;Jianan Chen;Wenchao Xu;Qi Qi;Song Guo;Jingyu Wang;Lu Lu;Hongwei Yang;Jianxin Liao","doi":"10.1109/TETC.2024.3404551","DOIUrl":"10.1109/TETC.2024.3404551","url":null,"abstract":"Deep neural network (DNN)-enabled edge intelligence has been widely adopted to support a variety of smart applications because of its ability to preserve privacy and conserve communication efficiency. The dilemma is that DNN models can be too large to be deployed on computationally constrained edge devices, and the volume of raw data can be too large to be efficiently transmitted to a centralized server. Thus, it is of utter importance that edge devices and cloud servers collaborate with each other to achieve fast and dependable model inference. Current collaborative solutions separate the DNN into two parts, which are placed and executed at the edge and in the cloud, respectively. However, these separated parts are executed consecutively, and all subsequent layers have to wait for the output of the previous layer even if they are not directly connected, causing significant inference latency. We propose a delay-aware edge-cloud collaboration (DECC) algorithm to reorganize the execution of DNN layers. By dividing DNN into several independent branches and selecting the optimal partition points, we apply a pipeline approach to parallelize the execution of these branches to minimize the inference delay. 
Extensive experiments show that the DECC outperforms existing methods by significantly reducing inference latency and improving throughput.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 2","pages":"438-450"},"PeriodicalIF":5.1,"publicationDate":"2024-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141934402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}