Scalable Embedded Systems: Towards the Convergence of High-Performance and Embedded Computing

2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing; Pub Date: 2015-10-21; DOI: 10.1109/EUC.2015.34

R. Giorgi


Citations: 21

Abstract

Embedded system toolchains are highly customized for a specific System-on-Chip (SoC). When an application needs more performance, the designer is typically forced to adopt a new SoC and possibly another toolchain. The rationale for not scaling performance by using, e.g., two SoCs is that keeping most operations on-chip may allow for higher energy efficiency. We are exploring the feasibility and trade-offs of designing and manufacturing a new Single Board Computer (SBC) that could flexibly serve a number of current and future applications by allowing scalability through clusters of SBCs while keeping the same programming model for the SBC. The board is based on FPGAs and embedded processors, and its key points are: i) a fast custom interconnect for board-to-board communication and ii) an easily programmable environment that allows both the off-loading of code into accelerators (either soft-IP or hard-IP blocks) and, at the same time, the distribution of computation across boards. A key challenge in successfully deploying this paradigm is to properly distribute the threads across several boards without the explicit intervention of the programmer. In this paper we describe how to dynamically and efficiently distribute the computational threads, in symbiosis with an appropriate memory model, to allow system scalability, so that we can double the performance by simply connecting two boards without i) changing the basic hardware components (e.g., to a different SoC) or ii) changing the programming model to follow a vendor-specific toolchain. Our approach is to reduce data movement across boards. Our initial experiments have confirmed the feasibility of our approach.
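The abstract does not spell out the distribution algorithm, so the following is only an illustrative sketch of the general idea of placing threads so that little data has to cross boards. It is not the paper's method. It assumes a hypothetical greedy policy in which each thread is assigned to the board that already holds most of the bytes it reads, subject to a fixed number of thread slots per board. The function name `place_threads` and all of its parameters are invented for this example.

```python
from collections import defaultdict


def place_threads(threads, data_home, num_boards, slots_per_board):
    """Greedy, locality-aware placement (hypothetical policy, not the paper's).

    threads:   dict thread_id -> {data_block: bytes_read}
    data_home: dict data_block -> index of the board where the block resides
    Returns (placement, remote_bytes), where placement maps thread_id -> board
    and remote_bytes counts the bytes that must cross a board boundary.
    """
    free = [slots_per_board] * num_boards
    placement = {}
    remote_bytes = 0

    # Place the heaviest data consumers first so they get their preferred board.
    order = sorted(threads, key=lambda t: -sum(threads[t].values()))

    for t in order:
        # Bytes this thread reads from each board's local memory.
        local = defaultdict(int)
        for block, nbytes in threads[t].items():
            local[data_home[block]] += nbytes

        candidates = [b for b in range(num_boards) if free[b] > 0]
        if not candidates:
            raise ValueError("not enough thread slots for all threads")

        # Prefer the board with the most local bytes; break ties by free slots.
        best = max(candidates, key=lambda b: (local[b], free[b]))
        placement[t] = best
        free[best] -= 1
        remote_bytes += sum(threads[t].values()) - local[best]

    return placement, remote_bytes
```

A call such as `place_threads(threads, data_home, num_boards=2, slots_per_board=2)` returns both the assignment and the number of bytes that would still travel over the board-to-board interconnect, which is the quantity such a policy tries to keep small.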

