ACM Journal on Emerging Technologies in Computing Systems: Latest Articles

PUF-Based Digital Money with Propagation-of-Provenance and Offline Transfers Between Two Parties

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2024-05-24 DOI: 10.1145/3663676

Benjamin Bean, Cyrus Minwalla, Eirini Eleni Tsiropoulou, Jim Plusquellic

Citations: 0

SAT-based Exact Modulo Scheduling Mapping for Resource-Constrained CGRAs

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2024-05-22 DOI: 10.1145/3663675

Cristian Tirelli, Juan Sapriza, Rubén Rodríguez Álvarez, Lorenzo Ferretti, Benoît Denkinger, Giovanni Ansaloni, José Miranda Calero, David Atienza, Laura Pozzi

Abstract: Coarse-Grain Reconfigurable Arrays (CGRAs) represent emerging low-power architectures designed to accelerate Compute-Intensive Loops (CILs). The effectiveness of CGRAs in providing acceleration relies on the quality of mapping: how efficiently the CIL is compiled onto the platform. State of the Art (SoA) compilation techniques utilize modulo scheduling to minimize the Iteration Interval (II) and use graph algorithms like Max-Clique Enumeration to address mapping challenges. Our work approaches the mapping problem through a satisfiability (SAT) formulation. We introduce the Kernel Mobility Schedule (KMS), an ad-hoc schedule used with the Data Flow Graph and CGRA architectural information to generate Boolean statements that, when satisfied, yield a valid mapping. Experimental results demonstrate SAT-MapIt outperforming SoA alternatives in almost 50% of explored benchmarks. Additionally, we evaluated the mapping results in a synthesizable CGRA design and emphasized the run-time metrics trends, i.e., energy efficiency and latency, across different CILs and CGRA sizes. We show that a hardware-agnostic analysis performed on compiler-level metrics can optimally prune the architectural design space, while still retaining Pareto-optimal configurations. Moreover, by exploring how implementation details impact cost and performance on real hardware, we highlight the importance of holistic software-to-hardware mapping flows, such as the one presented herein.
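The modulo-scheduling idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's SAT formulation or its Kernel Mobility Schedule: it only shows the standard resource-constrained lower bound on the II and the modulo resource check that any valid mapping must pass. The function names and the schedule representation are hypothetical, and data dependencies and routing between processing elements (PEs) are ignored.

```python
from math import ceil
from typing import Dict, Tuple


def res_mii(num_ops: int, num_pes: int) -> int:
    """Resource-constrained lower bound on the iteration interval (II).

    Each PE can issue at most one operation per cycle, so II cycles
    on num_pes PEs offer II * num_pes slots for num_ops operations.
    """
    return ceil(num_ops / num_pes)


def respects_modulo_resources(schedule: Dict[str, Tuple[int, int]], ii: int) -> bool:
    """Check the modulo resource constraint for a candidate schedule.

    schedule maps an operation name to (start_cycle, pe_index).
    Two operations conflict if they use the same PE in the same
    cycle modulo II, because loop iterations overlap every II cycles.
    """
    used_slots = set()
    for cycle, pe in schedule.values():
        slot = (cycle % ii, pe)
        if slot in used_slots:
            return False
        used_slots.add(slot)
    return True


# Hypothetical example: three operations placed on the same PE.
example = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}
print(res_mii(10, 4))                         # lower bound for 10 ops on 4 PEs
print(respects_modulo_resources(example, 2))  # "a" and "c" collide at slot 0
print(respects_modulo_resources(example, 3))  # each op gets its own slot
```

A SAT-based mapper would encode constraints of this kind, together with dependency and connectivity constraints, as Boolean clauses and try increasing II values until the solver reports a satisfying assignment.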

Citations: 0

Towards practical superconducting accelerators for machine learning using U-SFQ

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2024-04-09 DOI: 10.1145/3653073

Patricia Gonzalez-Guerrero, Kylie Huch, Nirmalendu Patra, Thom Popovici, George Michelogiannakis

Abstract: Most popular superconducting circuits operate on information carried by ps-wide, μV-tall, single flux quantum (SFQ) pulses. These circuits can operate at frequencies of hundreds of GHz with orders of magnitude lower switching energy than complementary metal-oxide-semiconductor (CMOS) circuits. However, under the stringent area constraints of modern superconductor technologies, fully-fledged, CMOS-inspired superconducting architectures cannot be fabricated at large scales. Unary SFQ (U-SFQ) is an alternative computing paradigm that can address these area constraints. In U-SFQ, information is encoded both in streams of SFQ pulses and in the temporal domain. In this work, we extend U-SFQ to introduce novel building blocks such as a multiplier and an accumulator. These blocks reduce area and power consumption by 2× and 4× compared with previously proposed U-SFQ building blocks, and yield at least 97% area savings compared with binary approaches. Using this multiplier and adder, we propose a U-SFQ Convolutional Neural Network (CNN) hardware accelerator capable of peak performance comparable with the state-of-the-art superconducting binary approach (B-SFQ) in 32× less area. CNNs can operate with 5-8 bits of resolution with no significant degradation in classification accuracy. For 5 bits of resolution, our proposed accelerator yields 5× to 63× better performance than CMOS and 15× to 173× better area efficiency than B-SFQ.

Citations: 0

Towards Energy-Efficient Spiking Neural Networks: A Robust Hybrid CMOS-Memristive Accelerator

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2023-12-05 DOI: 10.1145/3635165

Fabiha Nowshin, Hongyu An, Yang Yi

Abstract: Spiking Neural Networks (SNNs) are energy-efficient artificial neural network models that can carry out data-intensive applications. Energy consumption, latency, and memory bottlenecks are some of the major issues that arise in machine learning applications due to their data-demanding nature. Memristor-enabled Computing-In-Memory (CIM) architectures have been able to tackle the memory wall issue, eliminating the energy- and time-consuming movement of data. In this work we develop a scalable CIM-based SNN architecture with our fabricated two-layer memristor crossbar array. In addition to having an enhanced heat dissipation capability, our memristor exhibits a substantial enhancement of 10% to 66% in design area, power and latency compared to state-of-the-art memristors. This design incorporates an inter-spike interval (ISI) encoding scheme, chosen for its high information density, to convert the incoming input signals into spikes. Furthermore, we include a time-to-first-spike (TTFS) based output processing stage, chosen for its energy efficiency, to carry out the final classification. With the combination of ISI, CIM and TTFS, this network has a competitive inference speed of 2 μs/image and can successfully classify handwritten digits with 2.9 mW of power and 2.51 pJ energy per spike. The proposed architecture with the ISI encoding scheme can achieve ~10% higher accuracy than other encoding schemes on the MNIST dataset.
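The time-to-first-spike (TTFS) output stage named in this abstract can be illustrated with a minimal sketch. It shows the general TTFS decoding rule (the output neuron that fires earliest determines the class), not the authors' circuit; the function name and the use of None for a neuron that never fires are assumptions made for illustration.

```python
import math
from typing import Optional, Sequence


def ttfs_classify(first_spike_times: Sequence[Optional[float]]) -> Optional[int]:
    """Return the index of the output neuron that spiked first.

    first_spike_times[i] is the time of neuron i's first spike, or
    None if that neuron never fired (an assumed convention).
    Returns None when no neuron fired; ties go to the lowest index.
    """
    winner = None
    earliest = math.inf
    for index, spike_time in enumerate(first_spike_times):
        if spike_time is not None and spike_time < earliest:
            winner, earliest = index, spike_time
    return winner


# Hypothetical first-spike times (in microseconds) for four output neurons.
print(ttfs_classify([3.2, None, 1.5, 4.0]))  # neuron 2 fires earliest
```

Because the decision is available as soon as the first output spike arrives, a decoder of this kind does not need to count spikes over a full time window, which is the usual argument for its energy efficiency.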

Citations: 0

An Analysis of Various Design Pathways Towards Multi-Terabit Photonic On-Interposer Interconnects

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2023-12-01 DOI: 10.1145/3635031

Venkata Sai Praneeth Karempudi, Janibul Bashir, Ishan G Thakkar

Abstract: In the wake of dwindling Moore’s Law, to address the rapidly increasing complexity and cost of fabricating large-scale, monolithic systems-on-chip (SoCs), the industry has adopted disaggregation as a solution, wherein a large monolithic SoC is partitioned into multiple smaller chiplets that are then assembled into a large system-in-package (SiP) using advanced packaging substrates such as silicon interposers. For such interposer-based SiPs, there is a push to realize on-interposer inter-chiplet communication bandwidth of multi-Tb/s and end-to-end communication latency of no more than 10 ns. This push comes as the natural progression from some recent prior works on SiP design, and is driven by the proliferating bandwidth demand of modern data-intensive workloads. To meet this bandwidth and latency goal, prior works have focused on a potential solution of using the silicon photonic interposer (SiPhI) for integrating and interconnecting a large number of chiplets into an SiP. Despite the early promise, the existing designs of on-SiPhI interconnects still have to evolve by leaps and bounds to meet the goal of multi-Tb/s bandwidth. However, the possible design pathways through which such an evolution can be achieved have not yet been explored in prior work. In this paper, we identify several design pathways that can help evolve on-SiPhI interconnects to achieve multi-Tb/s aggregate bandwidth. We perform an extensive link-level and system-level analysis in which we explore these design pathways in isolation and in different combinations. From our link-level analysis, we observe that the design pathways that simultaneously enhance the spectral range and optical power budget available for wavelength multiplexing can render aggregate bandwidth of up to 4 Tb/s per on-SiPhI link. We also show that such high-bandwidth on-SiPhI links can substantially improve the performance and energy efficiency of state-of-the-art CPU- and GPU-chiplet-based SiPs.

Citations: 0

Introduction to the Special Issue on Next-Generation On-Chip and Off-Chip Communication Architectures for Edge, Cloud and HPC

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2023-10-31 DOI: 10.1145/3631144

John Kim, Tushar Krishna

Citations: 0

Design-Time Reference Current Generation for Robust Spintronic-Based Neuromorphic Architecture

Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2023-09-27 DOI: 10.1145/3625556

Soyed Tuhin Ahmed, Mahta Mayahinia, Michael Hefenbrock, Christopher Münch, Mehdi B. Tahoori

Abstract: Neural Networks (NN) can be efficiently accelerated in a neuromorphic fabric based on emerging resistive non-volatile memories (NVM), such as Spin Transfer Torque Magnetic RAM (STT-MRAM). Compared to other NVM technologies, STT-MRAM offers many benefits, such as fast switching, high endurance, and CMOS process compatibility. However, due to its low ON/OFF ratio, process variations and runtime temperature fluctuations can lead to misquantizing the sensed current and, in turn, degradation of inference accuracy. In this paper, we analyze the impact of the sensed accumulated current variation on the inference accuracy in Binary NNs and propose a design-time reference current generation method to improve the robustness of the implemented NN under different temperature and process variation scenarios (up to 125 °C). Our proposed method is robust to both process and temperature variations. The proposed method improves the accuracy of NN inference by up to 20.51% on the MNIST, Fashion-MNIST, and CIFAR-10 benchmark datasets in the presence of process and temperature variations without additional runtime hardware overhead compared to existing solutions.

Citations: 0

Secure and Lightweight Authentication Protocol Using PUF for the IoT-based Wireless Sensor Network

Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2023-09-18 DOI: 10.1145/3624477

Sourav Roy, Dipnarayan Das, Bibhash Sen

Abstract: The wireless sensor network (WSN) has been gaining popularity for automation and performance improvement in different IoT-based applications. The resource-constrained nature and operating environment of IoT make the devices highly vulnerable to different attacks. On the other hand, the Physically Unclonable Function (PUF) helps to implement secure and lightweight authentication protocols for IoT. In this context, a few computation-intensive authentication protocols that address secure IoT communication in WSN are found in the literature. Besides, these protocols depend on the local storage of PUF-CRP, which is susceptible to security attacks. This work proposes a lightweight and secure authentication protocol for the IoT devices in WSN. A PUF and its machine learning (ML)-based soft model are integrated to ensure secure authentication and lightweight computation in WSN. PUF prevents physical attacks while carrying very few hardware fingerprints, and the ML-based PUF provides the desired resiliency against PUF identity-based attacks by eliminating the requirement of CRP-based storage. The proposed mechanism delivers two-way authentication while nullifying the attacks on IoT. The proposed protocol is implemented on Xilinx Artix-7 FPGA and Raspberry Pi for testability and performance evaluation. Experimental results and analysis demonstrate the low-cost computation and lightweight features desired for IoT.

Citations: 0

SkyBridge 2.0: A Fine-grained Vertical 3D-IC Technology for Future ICs

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2023-08-31 DOI: 10.1145/3617501

Sachin Bhat, Mingyu Li, S. Kulkarni, C. A. Moritz

Abstract: Gate-all-around FETs are set to replace FinFETs to enable continued miniaturization of ICs in the deep nanometer regime. IMEC and IRDS roadmaps project that 3D integration of gate-all-around FETs is a key path for the IC industry beyond 2024. In this paper, we present SkyBridge 2.0, an IC technology featuring high-density, fine-grained 3D integration of vertical gate-all-around nanowire FETs, contacts, and interconnect while also solving 3D routability. We utilize industry-standard EDA tools to develop a customized design and technology co-optimization (DTCO) flow to design and evaluate SkyBridge 2.0. This DTCO flow covers process emulation of standard cells and SRAM to enable a scalable manufacturing pathway, TCAD characterization of vertical nanowire FETs to obtain IV and CV characteristics, compact modeling that accurately captures the device behavior, RC parasitic extraction of 3D interconnects, and performance, power and area assessment using ring oscillators. The technology assessment using ring oscillators shows that SkyBridge 2.0 at the chosen design point, using 10nm nanowires, achieves ~18% performance and 31% energy efficiency benefits compared to 7nm FinFET technology. Area analysis of logic cells shows up to 6x density benefits versus aggressively scaled 2D-CMOS cells. In addition to logic, we architect 3D SRAM to support low-power memory designs. SkyBridge 2.0 SRAM shows ~20% improvement in read and write static noise margin, up to 3x lower leakage current and up to 4x density benefits compared to 7nm FinFET technology.

Citations: 0

Repercussions of Using DNN Compilers on Edge GPUs for Real Time and Safety Critical Systems: A Quantitative Audit

IF 2.2, Zone 4 (Computer Science)

ACM Journal on Emerging Technologies in Computing Systems Pub Date : 2023-08-03 DOI: 10.1145/3611016

Omais Shafi, Mohammad Khalid Pandit, Amarjeet Saini, Gayathri Ananthanarayanan, Rijurekha Sen

Citations: 1