{"title":"Deterministic event-based control of Virtual Platforms for MPSoC software debugging","authors":"L. Murillo, Robert Buecs, R. Leupers, G. Ascheid","doi":"10.1109/SAMOS.2015.7363697","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363697","url":null,"abstract":"Virtual Platforms (VPs) are advantageous to develop and debug complex software for multi- and many-processor systems-on-chip (MPSoCs). VPs provide unrivalled controllability and visibility of the target, which can be exploited to examine bugs that cannot be reproduced easily in real hardware. However, VPs as used for debugging provide only traditional interfaces, such as step-based debuggers and traces, that do little to help with the enormous complexity of MPSoCs and their parallel software. Finding a bug is still largely left to the developer's experience and intuition, using manual means rather than automated solutions. To bridge this gap, this paper presents a novel VP debug visualization and control framework for concurrent software that allows examining and steering the target by means of an abstract representation of its inter-task interactions. Our framework reduces the effort required to understand complex concurrency patterns and helps to expose bugs.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125359237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AEGLE: A big bio-data analytics framework for integrated health-care services","authors":"D. Soudris, S. Xydis, Christos Baloukas, A. Hadzidimitriou, I. Chouvarda, K. Stamatopoulos, N. Maglaveras, John Chang, Andreas Raptopoulos, D. Manset, B. Pierscionek, R. Kayyali, N. Philip, Tobias Becker, K. Vaporidi, Eumorphia Kondili, D. Georgopoulos, L. Sutton, R. Rosenquist, L. Scarfò, P. Ghia","doi":"10.1109/SAMOS.2015.7363682","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363682","url":null,"abstract":"The AEGLE project aims to build an innovative ICT solution addressing the whole data value chain for health, based on cloud computing enabling dynamic resource allocation, HPC infrastructures for computational acceleration, and advanced visualization techniques. In this paper, we provide an analysis of the addressed Big Data health scenarios and we describe the key enabling technologies, as well as data privacy and regulatory issues to be integrated into AEGLE's ecosystem, enabling advanced health-care analytic services, while also promoting related research activities.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125856451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing applications for heterogeneous many-core architectures with the FlexTiles Platform","authors":"Benedikt Janßen, Fynn Schwiegelshohn, Martijn Koedam, François Duhem, Leonard Masing, Stephan Werner, Christophe Huriaux, A. Courtay, Emilie Wheatley, K. Goossens, F. Lemonnier, P. Millet, J. Becker, O. Sentieys, M. Hübner","doi":"10.1109/SAMOS.2015.7363683","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363683","url":null,"abstract":"The FlexTiles Platform has been developed within a Seventh Framework Programme project which is co-funded by the European Union, with ten participants from five countries. It aims to create a self-adaptive heterogeneous many-core architecture which is able to dynamically manage load balancing, power consumption and faulty modules. Its focus is to make the architecture efficient and to keep programming effort low. Therefore, the concept contains a dedicated automated tool-flow for creating both the hardware and the software, a simulation platform that can execute the same binaries as the FPGA prototype, and a virtualization layer to manage the final heterogeneous many-core architecture for run-time adaptability. With this approach, software development productivity can be increased and, thus, time-to-market and development costs can be decreased. In this paper we present the FlexTiles Development Platform with a many-core architecture demonstration. The steps to implement, validate and integrate two use-cases are discussed.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126777753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging the semantic gap between heterogeneous modeling formalisms and FMI","authors":"S. Tripakis","doi":"10.1109/SAMOS.2015.7363660","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363660","url":null,"abstract":"FMI (Functional Mockup Interface) is a standard for exchanging and co-simulating model components (called FMUs) coming from potentially different modeling formalisms, languages, and tools. Previous work has proposed a formal model for the co-simulation part of the FMI standard, and also presented two co-simulation algorithms which can be proven to have desirable properties, such as determinacy, provided the FMUs satisfy a formal contract. In this paper we discuss the principles for encoding different modeling formalisms, including state machines (both untimed and timed), discrete-event systems, and synchronous dataflow, as FMUs. The challenge is to bridge the various semantic gaps (untimed vs. timed, signals vs. events, etc.) that arise because of the heterogeneity between these modeling formalisms and the FMI API.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126835695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video chain demonstrator on Xilinx Kintex7 FPGA with EdkDSP floating point accelerators","authors":"J. Kadlec","doi":"10.1109/SAMOS.2015.7363690","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363690","url":null,"abstract":"This paper briefly describes the basic Kintex7 FPGA video pipeline infrastructure for the UTIA demonstrator in the ARTEMIS JU project ALMARVI. The video pipeline is combined with the run-time reprogrammable vector floating point EdkDSP accelerators on the same FPGA chip.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116593238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Constraint multi-processor Resource Allocation","authors":"A. Behrouzian, Dip Goswami, T. Basten, M. Geilen, Hadi Alizadeh Ara","doi":"10.1109/SAMOS.2015.7363695","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363695","url":null,"abstract":"This work proposes a Multi-Constraint Resource Allocation (MuCoRA) method for mapping applications from multiple domains onto multi-processors. In particular, we address a mapping problem for multiple throughput-constrained streaming applications and multiple latency-constrained feedback control applications onto a multi-processor platform running under a Time-Division Multiple-Access (TDMA) policy. The main objective of the proposed method is to reduce resource usage while meeting the constraints from both domains (i.e., throughput and latency constraints). We show by experiments that the overall resource usage for this mapping problem can be reduced by distributing the allocated resource (i.e., TDMA slots) to the control applications over the TDMA wheel instead of allocating consecutive slots.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130494590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning-based analytical cross-platform performance prediction","authors":"Xinnian Zheng, Pradeep Ravikumar, L. John, A. Gerstlauer","doi":"10.1109/SAMOS.2015.7363659","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363659","url":null,"abstract":"As modern processors are becoming increasingly complex, fast and accurate performance prediction is crucial during the early phases of hardware and software co-development. To accurately and efficiently predict the performance of a given software workload is, however, a challenging problem. Traditional cycle-accurate simulation is often too slow, while analytical models are not sufficiently accurate or still require target-specific execution statistics that may be slow or difficult to obtain. In this paper, we propose a novel learning-based approach for synthesizing analytical models that can accurately predict the performance of a workload on a target platform from various performance statistics obtained directly on a host platform using built-in hardware counters. Our learning approach relies on a one-time training phase using a cycle-accurate reference of the chosen target processor. We train our models on over 15,000 program instances from the ACM-ICPC programming contest database, and demonstrate the prediction accuracy on standard benchmark suites. Results show that our approach achieves on average more than 90% accuracy at 160× the speed compared to a cycle-accurate reference simulation.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121580095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An interval algebra for multiprocessor resource allocation","authors":"L. Indrusiak, P. Dziurzański","doi":"10.1109/SAMOS.2015.7363672","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363672","url":null,"abstract":"This paper presents an interval algebra created specifically to evaluate timing properties of multiprocessor systems. It models the application load as intervals, and considers allocation and scheduling as algebraic operations over those intervals, aiming to analyse the impact of resource allocation decisions on application response times or schedulability. The theoretical background is introduced informally, followed by the description of a reference implementation of the interval algebra in C++, aiming to appeal to the design practitioner rather than the formalist. Examples of the usage of the proposed algebra are also provided, showing its applicability to the performance evaluation of industrial systems implemented over bus-based and Network-on-Chip multiprocessor platforms. A particular design flow is highlighted, where the interval algebra is used as a fitness function in a genetic algorithm tailored to optimise resource allocation in hard real-time multiprocessors.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127576335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power optimizations for transport triggered SIMD processors","authors":"Joonas Multanen, T. Viitanen, Henry Linjamaki, Heikki O. Kultala, P. Jääskeläinen, J. Takala, L. Koskinen, Jesse Simonsson, H. Berg, K. Raiskila, Tommi Zetterman","doi":"10.1109/SAMOS.2015.7363689","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363689","url":null,"abstract":"Power consumption is a key aspect of modern processor design. Optimizing the processor for power leads to direct savings in battery energy consumption in the case of mobile devices. At the same time, many mobile applications demand high computational performance. In the case of large-scale computing, low-power compute devices help in thermal design and in reducing the electricity bill. This paper presents a case study of a customized low-power vector processor design that was synthesized on a 28 nm process technology. The processor has a programmer-exposed datapath based on the transport triggered architecture programming model. The paper's focus is on the RTL and microarchitecture level power optimizations applied to the design. Using a semi-automated interconnection network and register file optimization algorithm, power savings of up to 27% were achieved. Using this as a baseline and applying register file datapath gating and register file banking, and enabling clock gating of individual pipeline stages in pipelined function units, power and energy savings of up to 26% could be achieved with only a 3% area overhead. On top of this, for the measured radio applications, the exposed datapath architecture helped to achieve approximately 18% power improvement in comparison to a VLIW-like architecture by utilizing optimizations unique to transport triggered architectures.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133664542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FNOCEE: A framework for NoC evaluation by FPGA-based emulation","authors":"D. Pfefferkorn, Achim Schmider, G. P. Vayá, M. Neuenhahn, H. Blume","doi":"10.1109/SAMOS.2015.7363663","DOIUrl":"https://doi.org/10.1109/SAMOS.2015.7363663","url":null,"abstract":"This paper introduces FNOCEE, a framework for the evaluation of NoC-based many-core systems by FPGA-based emulation. It uses a task graph-oriented approach to model applications, while a hardware-accelerated genetic algorithm is employed to find close-to-optimal solutions to the task mapping problem. The proposed genetic algorithm is analyzed in detail, e.g., in terms of mutation rate and number of elite individuals. In order to illustrate the framework's capabilities, several case studies have been performed, wherein the scalability of relevant parallel applications is investigated with regard to the number and type of available processing cores and the traffic load generated by inter-task communication.","PeriodicalId":346802,"journal":{"name":"2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123415686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}