{"title":"ToPoliNano and fiction: Design Tools for Field-coupled Nanocomputing","authors":"U. Garlando, Marcel Walter, R. Wille, F. Riente, F. Sill, R. Drechsler","doi":"10.1109/DSD51259.2020.00071","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00071","url":null,"abstract":"Field-coupled Nanocomputing (FCN) is a computing concept with several promising post-CMOS candidate implementations that offer tremendously low power dissipation and highest processing performance at the same time. Two of the manifold physical implementations are Quantum-dot Cellular Automata (QCA) and Nanomagnet Logic (NML). Both inherently come with domain-specific properties and design constraints that render established conventional design algorithms inapplicable. Accordingly, dedicated design tools for those technologies are required. This paper provides an overview of two leading examples of such tools, namely fiction and ToPoliNano. Both tools provide effective methods that cover aspects such as placement, routing, clocking, design rule checking, verification, and logical as well as physical simulation. By this, both freely available tools provide platforms for future research in the FCN domain.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127504484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Formal Model for the Automatic Configuration of Access Protection Units in MPSoC-Based Embedded Systems","authors":"Tobias Dörr, T. Sandmann, J. Becker","doi":"10.1109/DSD51259.2020.00098","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00098","url":null,"abstract":"Heterogeneous system-on-chip platforms with multiple processing cores are becoming increasingly common in safety- and security-critical embedded systems. To facilitate a logical isolation of physically connected on-chip components, internal communication links of such platforms are often equipped with dedicated access protection units. When performed manually, however, the configuration of these units can be both time-consuming and error-prone. To resolve this issue, we present a formal model and a corresponding design methodology that allows developers to specify access permissions and information flow requirements for embedded systems in a mostly platform-independent manner. As part of the methodology, the consistency between the permissions and the requirements is automatically verified and an extensible generation framework is used to transform the abstract permission declarations into configuration code for individual access protection units. We present a prototypical implementation of this approach and validate it by generating configuration code for the access protection unit of a commercially available multiprocessor system-on-chip.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128144122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and Exact Design Space Exploration for Heterogeneous and Multi-Bus Platforms","authors":"Amna Gharbi, Andrea Enrici, B. Uscumlic, L. Apvrille, R. Pacalet","doi":"10.1109/DSD51259.2020.00014","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00014","url":null,"abstract":"Design Space Exploration of data-flow Systems-on-Chip either focuses on classical shared bus or on complex network-on-chip (NoC) architectures. A lack of research work exists that targets segmented bus architectures. These offer performance improvements (latency, power consumption) with respect to a shared bus, while employing much simpler communication structures and algorithms than a NoC. Despite the lack in the research work, segmented buses are popular in multiprocessor systems and in FPGA interconnects. This paper fills this gap with two contributions. First, we propose a Satisfiability Modulo Theory (SMT) formulation. Secondly, we provide a technique to reduce the design-space explosion problem that is portable to other formulations (e.g., ILP, MILP) and to problems where the scheduling on units (e.g., bus, CPU) is multiplexed in time. We integrated these contributions in a state-of-the-art design tool that we employ for evaluation purposes with a set of streaming applications and an MPSoC platform. The resulting framework can study the performance of fixed interconnects as well as determine the optimal architecture among a set of candidates. Our reduction technique considerably improves the scalability of DSE. For our testbench, we reduce the SMT solver run-time by a factor of 20 up to 589.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125816595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel Bloom filter algorithms and architectures for ultra-high-speed network security applications","authors":"Arish Sateesan, Jo Vliegen, J. Daemen, N. Mentens","doi":"10.1109/DSD51259.2020.00050","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00050","url":null,"abstract":"This paper proposes novel Bloom filter algorithms and FPGA architectures for high-speed searching applications. A Bloom filter is a memory structure that is used to test whether input search data are present in a table of stored data. Bloom filters are extensively used in network security solutions that apply traffic flow monitoring or deep packet inspection. Improving the speed of Bloom filters can therefore have a significant impact on the speed of many network applications. The most important components determining the speed of Bloom filters are hash functions. While hash functions in Bloom filters do not require strong cryptographic properties, they do need a minimized computational delay. We take on the challenge of developing ultra-high-speed Bloom filters on FPGAs by proposing a new non-cryptographic hash function, called Xoodoo-NC, derived from the cryptographic permutation Xoodoo. Xoodoo-NC is a reduced-round, reduced-state version of Xoodoo, inheriting Xoodoo’s desired avalanche properties and low logical depth, resulting in an ultra-low-latency non-cryptographic hash function. We evaluate the performance of Bloom filter architectures based on Xoodoo-NC on a Xilinx UltraScale+ FPGA and we compare the performance and resource occupation to existing Bloom filter implementations. We additionally compare our results to memories that use the built-in CAM cores in Xilinx UltraScale+ FPGAs. Our proposed algorithmic and architectural advances lead to Bloom filters that, to the best of our knowledge, outperform all other FPGA-based solutions.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126835259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"mcQEMU: Time-Accurate Simulation of Multi-core platforms using QEMU","authors":"H. Carvalho, Geoffrey Nelissen, P. Zaykov","doi":"10.1109/DSD51259.2020.00024","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00024","url":null,"abstract":"Full-system emulators allow the execution of guest operating systems and applications without the need of having access to the real target hardware. For many applications, besides the correct functional modeling, the full-system emulator shall also be time-accurate. In this paper, we present a new full-system multi-core simulator that delivers time-accurate execution and preserves the functional correctness of guest application. The proposed solution is based on QEMU. We enriched QEMU with various time models of multi-core platforms. We call this new full-system simulator mcQEMU. mcQEMU supports guest CPUs with out-of-order and in-order architectures. We validated mcQEMU by emulating multi-core ARM processors in system mode. The time accuracy of mcQEMU is evaluated with the TACLeBench benchmark suite. From a timing prediction viewpoint, mcQEMU achieves an estimation error of only 15% on average when emulating the out-of-order i.MX6Quad processor by NXP. For full-system simulation, mcQEMU runs at 35 Mips for in-order architectures and 25 Mips for out-of-order ones. In user-mode simulation, mcQEMU can achieve up to 65 Mips.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128431298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DSD 2020 Committees","authors":"Antonio Núñez Ordóñez","doi":"10.1109/dsd51259.2020.00007","DOIUrl":"https://doi.org/10.1109/dsd51259.2020.00007","url":null,"abstract":"","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133786074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design Space Exploration in the Mapping of Reversible Circuits to IBM Quantum Computers","authors":"Philipp Niemann, Alexandre A. A. de Almeida, G. Dueck, R. Drechsler","doi":"10.1109/DSD51259.2020.00070","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00070","url":null,"abstract":"With more and more powerful quantum computers becoming available, there is an increasing interest in the efficient mapping of a given quantum circuit to a particular quantum computer (so-called technology mapping). In most cases, the limitations of the targeted quantum hardware have not been taken into account when generating these quantum circuits in the first place. Thus, the technology mapping is likely to induce a considerable overhead for such circuits. In this paper, we consider the realization of reversible circuits consisting of multiple-controlled Toffoli gates on IBM quantum computers. We show that choosing different quantum-level decompositions can indeed have a significant impact on the mapping overhead. Based on this observation, we present an approach to perform design space exploration to obtain quantum circuits with reduced overhead by exploiting information about the targeted quantum hardware as well as the reversible circuit. An experimental evaluation shows that this approach often leads to considerable reductions of the technology mapping overhead with negligible runtime.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133918610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Scheduling of FPGAs for Cloud Data Center Infrastructures","authors":"Matteo Bertolino, R. Pacalet, L. Apvrille, Andrea Enrici","doi":"10.1109/DSD51259.2020.00021","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00021","url":null,"abstract":"In modern cloud data centers, reconfigurable devices can be directly connected to the network of a data center. This configuration enables FPGAs to be rented for acceleration of data-intensive workloads. In this context, novel scheduling solutions are needed to maximize the utilization (profitability) of FPGAs, e.g., reduce latency and resource fragmentation. Algorithms that schedule groups of tasks (clusters, packs), rather than individual tasks (list scheduling), well match the functioning of FPGAs. Here, groups of tasks that execute together are interposed by hardware reconfigurations. In this paper, we propose a heuristic based on a novel method for grouping tasks. These are gathered around a high-latency task that hides the latency of remaining tasks within the same group. We evaluated our solution on a benchmark of almost 30000 random workloads, synthesized from realistic designs (i.e., topology, resource occupancy). For this testbench, on average, our heuristic produces optimum makespan solutions in 71.3% of the cases. It produces solutions for moderately constrained systems (i.e., the deadline falls within 10% of the optimum makespan) in 88.1% of the cases.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132315692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized HW/FW Generation from an Abstract Register Interface Model","authors":"Michael Werner, Igli Zeraliu, Zhao Han, S. Prebeck, Lorenzo Servadei, W. Ecker","doi":"10.1109/DSD51259.2020.00017","DOIUrl":"https://doi.org/10.1109/DSD51259.2020.00017","url":null,"abstract":"The HW/SW interface is a common and crucial component in System-on-Chips, enabling the interaction between software and hardware. Generating architecture and firmware code of the interface from extended IP-XACT, SystemRDL, or proprietary formalism is an established technology. This paper describes a new area and performance optimization step in the HW/SW interface generation process that reduces the silicon area and hardware access time through firmware. Three improvements of the underlying formalism are applied to achieve the optimization: First, a decoupling of bit fields from registers, which allows the rearrangement of the memory layout easily. Second, the specification of hardware accesses, which constrains the bit field arrangement. Third, different implementations of bit field accesses, such as memory-mapped or via CPU special registers. The used generation framework follows the approach of model-driven architecture, which includes optimization. Initially, abstract models specify the requirements of the IP or the HW/SW interface. Transformations turn these models into platform-independent models of hardware and firmware. These models are further transformed into implementation-specific models of a target language, such as hardware description languages or C. The proposed optimization has been successfully applied to peripheral variants of a CPU subsystem used in an industrial demonstrator. An area reduction of 19% and a performance gain of 11% have been achieved by optimizing the interfaces.","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132110445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DSD 2020 List Reviewer Page","authors":"","doi":"10.1109/dsd51259.2020.00009","DOIUrl":"https://doi.org/10.1109/dsd51259.2020.00009","url":null,"abstract":"","PeriodicalId":128527,"journal":{"name":"2020 23rd Euromicro Conference on Digital System Design (DSD)","volume":"180 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124491903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}