{"title":"A framework for reducing the modeling and simulation complexity of Cyberphysical Systems","authors":"N. Zompakis, K. Siozios","doi":"10.1109/SAMOS.2015.7363699","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363699","url":null,"abstract":"As systems continue to evolve, they rely less on human decision-making and more on computational intelligence. This trend, in conjunction with the available technologies for providing advanced sensing, measurement, process control, and communication, leads us towards the new field of Cyber-Physical Systems (CPS). Although these systems exhibit remarkable characteristics, the increased complexity imposed by numerous components and services makes their design extremely difficult. This paper proposes a software-supported framework for reducing the design complexity regarding the modeling, as well as the simulation, of CPS. For this purpose, a novel technique based on system scenarios is applied. Evaluation results prove the effectiveness of the introduced framework, as we notably reduce the modeling and simulation complexity with a controllable overhead in accuracy.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"175 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123508787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual processing sparks a new class of processors","authors":"Marco C. Jacobs","doi":"10.1109/SAMOS.2015.7363651","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363651","url":null,"abstract":"Summary form only given. Augmented reality, gesture interfaces, and automotive driver assistance systems enable novel user experiences, safer rides, and new usage models. Bringing these systems to market requires a power-efficient architecture and many billions of operations per second of processing. Widely adopted processing architectures like CPUs and GPUs can't fulfill the requirements, sparking a new class of video and vision processors. In this talk we'll give a quick overview of applications, typical algorithms, and their implications on computer architecture. We will focus on the automotive market, where computer vision is the key technology for enabling autonomous vehicles to hit the road.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127054070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic re-vectorization of binary code","authors":"Nabil Hallou, Erven Rohou, P. Clauss, A. Ketterlin","doi":"10.1109/SAMOS.2015.7363680","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363680","url":null,"abstract":"In many cases, applications are not optimized for the hardware on which they run. Several reasons contribute to this unsatisfying situation, including legacy code, commercial code distributed in binary form, or deployment on compute farms. In fact, backward compatibility of ISA guarantees only the functionality, not the best exploitation of the hardware. In this work, we focus on maximizing the CPU efficiency for the SIMD extensions and propose to convert automatically, and at runtime, loops vectorized for an older version of the SIMD extension to a newer one. We propose a lightweight mechanism that does not include a vectorizer, but instead leverages what a static vectorizer previously did. We show that many loops compiled for x86 SSE can be dynamically converted to the more recent and more powerful AVX, and how correctness is maintained with regard to challenges such as data dependences and reductions. We obtain speedups in line with those of a native compiler targeting AVX. The re-vectorizer is implemented inside a dynamic optimization platform; it is completely transparent to the user, does not require rewriting binaries, and operates during program execution.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"94 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113962524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Physical design aware system level synthesis of hardware","authors":"Nasim Farahini, A. Hemani, Hasan Sohofi, Shuo Li","doi":"10.1109/SAMOS.2015.7363669","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363669","url":null,"abstract":"In spite of decades of research, only a small percentage of hardware is designed using high-level synthesis because of the large gap between the abstraction levels of standard cells and the algorithmic level. We propose a grid-based regular physical design platform composed of large-grain hardened building blocks called SiLago blocks. This platform is divided into regions which are specialized for different functionalities like computation, storage, system control, etc. The characterized micro-architectural operations of the SiLago platform serve as the interface to a meet-in-the-middle high-level and system-level synthesis framework. This framework was used to generate three hardware macro instances, derived from the SiLago platform, for three applications from the signal processing domain. Results show two orders of magnitude improvements in efficiency of the system-level design space exploration and synthesis time, with an average loss in design quality of 18% for energy and 54% for area compared to the commercial SOC flow.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"62 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134427185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel SystemC simulation for ESL design using flexible time decoupling","authors":"Jan Weinstock, R. Leupers, G. Ascheid","doi":"10.1109/SAMOS.2015.7363702","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363702","url":null,"abstract":"Engineers of next generation embedded systems heavily rely on virtual platforms as central tools in their design process. Yet, the ever increasing HW/SW complexity degrades the simulation performance of those platforms and threatens their viability as design tools. With multi-core workstations today being widely available, the transition towards parallel simulation technologies seems obvious. Recently published parallel SystemC simulators use time-decoupling to achieve high simulation performance on modern SMP machines. However, those simulators have to identify all cross-thread communication ahead of time. This work presents an approach to overcome this limitation and to enable time-decoupled simulation for mainstream SystemC simulators, achieving a speedup of up to 3.4× on a quad-core host.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116565149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decentralized diagnosis of permanent faults in automotive E/E architectures","authors":"Peter Waszecki, M. Lukasiewycz, S. Chakraborty","doi":"10.1109/SAMOS.2015.7363675","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363675","url":null,"abstract":"This paper presents a novel decentralized approach for the diagnosis of permanent faults in automotive Electrical and Electronic (E/E) architectures. Both the safety-critical real-time requirements and the distributed nature of these systems make fault tolerance in general and fault diagnosis in particular a crucial and challenging issue. At the same time, high unit numbers in manufacturing add cost efficiency as an important criterion during system design, which conflicts with the use of often expensive explicit fault diagnosis hardware. To address these challenges, we propose a diagnosis framework that consists of two stages. In the first stage, diagnosis determination, potential fault scenarios, such as defective Electronic Control Units (ECUs), are investigated to obtain a set of diagnosis functions. Specific diagnosis functions are used for each component in the network at runtime to determine whether a certain fault scenario is present. In the second stage, diagnosis optimization, an optimization of diagnosis functions is proposed to determine trade-offs between diagnosis times and the number of monitored message streams. Experimental results based on 100 synthetic test cases give evidence of the feasibility and efficiency of the presented framework. Finally, an automotive case study demonstrates the practicability and details of our fault diagnosis approach.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132019936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-level synthesizable dataflow MapReduce accelerator for FPGA-coupled data centers","authors":"D. Diamantopoulos, C. Kachris","doi":"10.1109/SAMOS.2015.7363656","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363656","url":null,"abstract":"Manipulating big-data entries of emerging server workloads requires a design paradigm shift towards more aggressive system-level architecture solutions. From a software perspective, the MapReduce framework is a prominent parallel data processing tool as the volume of data to analyze grows rapidly. FPGAs can be used to accelerate the processing of data and significantly reduce the power consumption. However, FPGAs have not been deployed in data centers due to the high programming complexity of hardware. In this paper we present HLSMapReduceFlow, a novel reconfigurable MapReduce accelerator that can be scaled up to data centers and can speed up the processing of Map computation kernels, while promising minimum energy footprint and high programming efficiency due to the use of HLS. We propose the complete decoupling of MapReduce's tasks data-paths to distinct buses, accessed from individual processing engines. Such a dataflow approach implies a holistic C/C++ to RTL domain-level MapReduce transition. In this work, we further extend HLS tools, with systematic source-to-source code annotation of HLS optimization directives, by adding a state-of-the-art system-level implementation toolflow. The proposed architecture is implemented, mapped to, and evaluated on a Virtex-7 FPGA, and the results show that the proposed scheme can achieve up to 4.3× overall throughput improvement in MapReduce applications, while offering two orders of magnitude power/energy improvements compared to a high-end multi-core processor.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128638142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel program = operator + schedule + parallel data structure","authors":"K. Pingali","doi":"10.1109/SAMOS.2015.7363652","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363652","url":null,"abstract":"Summary form only given. Multicore and manycore processors are now ubiquitous, but parallel programming remains as difficult as it was 30-40 years ago. In this talk, I will argue that these problems arise largely from the computation-centric abstractions that we currently use to think about parallelism. In their place, I will propose a novel data-centric foundation for parallel programming called the operator formulation in which algorithms are described in terms of unitary actions on data structures. This data-centric view of parallel algorithms shows that a generalized form of data-parallelism called amorphous data-parallelism is ubiquitous even in complex, irregular graph applications such as mesh generation and partitioning algorithms, graph analytics, and machine learning applications. Binding time considerations provide a unification of parallelization techniques ranging from static parallelization to speculative parallelization. We have built a system called Galois, based on these ideas, for exploiting amorphous data-parallelism on multicores and GPUs. I will present experimental results from our group as well as from other groups that are using the Galois system.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115428147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Current analysis approaches and performance needs for whole slide image processing in breast cancer diagnostics","authors":"I. Pöllänen, Billy Braithwaite, Keijo Haataja, Tiia Ikonen, Pekka J. Toivanen","doi":"10.1109/SAMOS.2015.7363692","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363692","url":null,"abstract":"In this paper, the current approaches and performance needs for whole slide image (WSI) analysis processing in breast cancer diagnostics are discussed. WSIs provide high resolution digital image data from the patient's diseased tissue. Digital whole slide images are typically very large and contain a high amount of information. Digitizing tissue specimens into the form of digital images allows the development and application of computational analysis algorithms. Biological tissues are complex, with variance in tissue structures between healthy individuals as well as between patients with the same disease. Furthermore, the tissue preparation and digitization usually generate a lot of artifacts and more complexity, which cause classification challenges. This variance and also the large size of the images make creating an accurate and reliable automated breast cancer image analysis a challenge. In the ALMARVI project we aim at generating and implementing efficient histopathological image analysis algorithms in our breast cancer analysis scheme. This paper focuses on discussing relevant information concerning histopathological breast cancer diagnosis, and could also be considered as an introduction to the concept of WSI analysis for non-experts. Since the WSI sizes are very large (up to 40 GB with no compression), the computational analysis is challenging and requires computationally efficient tools and suitable approaches to relieve the problems caused by the large size of the images.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131324159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A scriptable, standards-compliant reporting and logging extension for SystemC","authors":"J. Wagner, Rolf Meyer, R. Buchty, Mladen Berekovic","doi":"10.1109/SAMOS.2015.7363700","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363700","url":null,"abstract":"The shift towards more and more complex System-on-Chips fosters high-level modeling (HLM) of new systems in order to provide required time-to-first-virtual-prototype and adequate simulation speed. Using HLM furthermore allows running exhaustive simulations, enabling the developer to gain a plethora of information from the system during simulation. Reporting, logging, analyzing, and interpreting this vast amount of data requires a potent report and logging system. This paper proposes such a solution: the presented system is built on the foundations of SystemC's sc_report class and maintains full compatibility with it. To provide extensive search and analysis features, the proposed solution features Python-based scripting capabilities and supports attaching key-value pairs to each report message. Using highly efficient black- and whitelisting filters empowers the user to filter reported events during runtime and suppress all irrelevant reports in order to achieve fast simulation. Filter rules are fully scriptable and interpreted during simulation runtime, allowing dynamic adaptation of the rules based on events that have occurred. All proposed mechanisms were evaluated under real-world conditions in an existing virtual prototype platform, including a report database backend, enabling easy analysis of the generated data.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120944966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}