{"title":"Title Page I","authors":"","doi":"10.1109/async48570.2021.00001","DOIUrl":"https://doi.org/10.1109/async48570.2021.00001","url":null,"abstract":"","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"294 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130802812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconfigurable ASIC Implementation of Asynchronous Recurrent Neural Networks","authors":"Spencer Nelson, Sang Yun Kim, J. Di, Zhe Zhou, Zhihang Yuan, Guangyu Sun","doi":"10.1109/ASYNC48570.2021.00015","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00015","url":null,"abstract":"In order to provide ASIC implementations of machine learning algorithms with a certain degree of reconfigurability, such that applications like edge computing are able to incorporate these designs in contrast to FPGA implementations due to the requirements of lower power, lower cost, and improved security, this paper presents the methodologies used to implement a wide range of gated RNN configurations as a single reconfigurable asynchronous ASIC. This design utilizes the Multi-threshold NULL Convention Logic (MTNCL) asynchronous design paradigm. To create the design, the reconfigurable aspects were analyzed, and determinations were made on how to create individual reconfigurable design components and how to share the signal paths as well as the control that could best achieve the overall objective. The resulting implementation is being fabricated in the TSMC 65nm bulk CMOS process. Transistor-level simulations were performed to characterize the minimum and maximum sized configurations.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125739336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Asynchronous Serial Infrastructure Using FPIO","authors":"Andrew Lines","doi":"10.1109/ASYNC48570.2021.00017","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00017","url":null,"abstract":"\"Four Pin Input Output\" (FPIO) is a delay-insensitive asynchronous bit-serial multi-chip management protocol which surpasses \"Serial Peripheral Interface\" (SPI) by eliminating timing races, waiting for peripherals to acknowledge, using unidirectional fanout-1 full-swing signals, eliminating chip select wires, and scaling to rings of dozens of chips using less wiring. The message protocol over FPIO supports on-chip tree and ring topologies to connect internal interfaces such as virtualized wires, scan chains, NoC message interfaces, and circuit testers. Intel’s Loihi neuromorphic processor and related chips use this new infrastructure. We propose these protocols as an open standard for serial management, especially suited for asynchronous chips.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"42 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113936083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Hazard-Free Multiplexer Based Implementation of Self-Timed Circuits","authors":"A. Kushnerov, Moti Medina, A. Yakovlev","doi":"10.1109/ASYNC48570.2021.00011","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00011","url":null,"abstract":"The cost of design, test and fabrication of self-timed circuits remains prohibitive for their wider adoption in practice. Addressing this issue, researchers are trying to find ways for rapid prototyping of self-timed circuits in FPGAs. Combinational logic is realized in FPGAs by look-up tables (LUTs), which are typically built as a binary tree of 2-way multiplexers (MUX 2:1). This brings us to the idea of using MUX 2:1 in self-timed designs, particularly in quasi-delay-insensitive (QDI) circuits. Multiplexers, however, realize a binate (non-monotone) Boolean function and therefore may cause logic hazards. A standard way of preventing these hazards requires designing a special circuit for MUX 2:1. On the other hand, there is indirect evidence that the multiplexers in some commercial FPGAs are hazard-free. Based on this assumption, we propose an original approach for realizing a multi-input C-element, which is widely used in QDI circuits. This paves the way for using hazard-free MUX 2:1 in more complex self-timed elements. All the proposed circuits are designed and verified in the CAD tool Workcraft.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"361 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114766993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-timed Reinforcement Learning using Tsetlin Machine","authors":"A. Wheeldon, A. Yakovlev, R. Shafik","doi":"10.1109/ASYNC48570.2021.00014","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00014","url":null,"abstract":"We present a hardware design for the learning datapath of the Tsetlin machine algorithm, along with a latency analysis of the inference datapath. In order to generate low-energy hardware that is suitable for pervasive artificial intelligence applications, we use a mixture of asynchronous design techniques, including Petri nets, signal transition graphs, dual-rail and bundled-data. The work builds on a previous design of the inference hardware, and includes an in-depth breakdown of the automaton feedback, probability generation and Tsetlin automata. Results illustrate the advantages of asynchronous design in applications such as personalized healthcare and battery-powered internet of things devices, where energy is limited and latency is an important figure of merit. Challenges of static timing analysis in asynchronous circuits are also addressed.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116489258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Token Rings for Address-Event Encoding","authors":"P. Purohit, R. Manohar","doi":"10.1109/ASYNC48570.2021.00010","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00010","url":null,"abstract":"Address-event representation (AER) is an event-driven, neuromorphic inter-chip encoding and communication protocol originally proposed to communicate location and timing information of sparse neural events between neuromorphic chips. The protocol is widely used in bio-inspired, event-based vision sensors to communicate visual events. The same approach has been explored for scientific imaging applications, but the performance of existing encoding schemes degrades in the presence of large sensor arrays and a wide range of event rates. In this paper, we introduce a new AER encoding scheme based on hierarchical token-rings (HTR), which blends event-based and scanning approaches. We show that HTR offers significant improvement in latency, throughput, and power compared to existing tree-based approaches in the context of scientific imaging applications.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130903777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An asynchronous hybrid pixel image sensor","authors":"Mohamed Akrarai, Nils Margotat, G. Sicard, L. Fesquet","doi":"10.1109/ASYNC48570.2021.00016","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00016","url":null,"abstract":"Today, most computer vision applications produce a huge computational load, which can become an issue for autonomous systems such as robots. This is mainly due to the image sensor readout, which permanently captures the image at a fixed rate and produces a relatively high throughput bitstream. Therefore, finding techniques minimizing the data throughput helps to drastically reduce power. Event-based image sensors are able to capture images with a low throughput bitstream, thanks to a sampling strategy eliminating temporal and spatial redundancies. This natively gives a data-compressed image, which favors lower storage and computation. This article presents an event-based image sensor incorporating a hybrid pixel matrix composed of two pixel types and an arbiterless asynchronous readout system. The results show an important bitstream reduction compared to that of a standard CMOS image sensor. A testchip of our event-based image sensor has been designed and is currently under fabrication.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128976473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Welcome Message from the Chairs","authors":"","doi":"10.1109/async48570.2021.00005","DOIUrl":"https://doi.org/10.1109/async48570.2021.00005","url":null,"abstract":"","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128631952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 28nm Configurable Asynchronous SNN Accelerator with Energy-Efficient Learning","authors":"Jilin Zhang, Mingxuan Liang, Jinsong Wei, Shaojun Wei, Hong Chen","doi":"10.1109/ASYNC48570.2021.00013","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00013","url":null,"abstract":"In this paper, we put forward an energy-efficient configurable asynchronous SNN accelerator for energy-constrained applications, which includes 256 neurons and 131K synapses with 8-bit fixed-point weights. To achieve high energy efficiency and on-chip learning ability, we propose a sparse target propagation (S-TP) algorithm and design the accelerator with Click-based bundled-data asynchronous circuits. The SNN accelerator is implemented in a 28nm CMOS process, and the post-place-and-route (post-PAR) simulation results indicate that the SNN accelerator achieves on-chip learning with inference power efficiency of 3.97 pJ/SOP and 95.7% classification accuracy on the NMNIST test dataset, which outperforms prior neuromorphic on-chip learning systems.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121146413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fluid: An Asynchronous High-level Synthesis Tool for Complex Program Structures","authors":"Rui Li, Lincoln Berkley, Yihang Yang, R. Manohar","doi":"10.1109/ASYNC48570.2021.00009","DOIUrl":"https://doi.org/10.1109/ASYNC48570.2021.00009","url":null,"abstract":"Current high-level synthesis (HLS) tools that generate synchronous logic construct a state machine that schedules program operations in each clock cycle. Rather than this centralized approach, we are developing an HLS methodology tailored to high-performance asynchronous dataflow circuits building on prior work in dataflow synthesis. We propose a new solution to dataflow circuit generation needed when translating real-world programs with complex control flow. We implement our approach in the LLVM compiler framework, and show that our generated circuits achieve better performance in throughput and energy compared to a number of existing HLS tools. We also quantify the benefits of dataflow graph optimizations on the quality of the generated circuits.","PeriodicalId":314811,"journal":{"name":"2021 27th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131848076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}