{"title":"Automatic generation of compact formal properties for effective error detection","authors":"Michele Bertasi, G. D. Guglielmo, G. Pravadelli","doi":"10.1109/CODES-ISSS.2013.6659015","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659015","url":null,"abstract":"Several approaches exist in the literature for the automatic extraction of model behaviours represented in the form of formal properties. Some of them rely on static analysis of the source code, others dynamically mine specifications by analysing simulation traces. In both cases, most of them work at bit level and generate properties in the form of combinational or temporal relationships among Boolean expressions. Such techniques are suited only for gate-level or RTL HW models. There are also approaches working on system-level descriptions and SW programs, but they generate properties to express only the sequential ordering of communication function calls and events, while the functional part of the implementation is ignored. To fill this gap, this paper presents a dynamic methodology that works on gate-level, RTL and system-level HW descriptions as well as embedded SW, independently of the design model and the abstraction level. The generated properties are in the form of temporal relationships among arithmetic and logic expressions involving traditional HW description language data types (i.e., bit and logic vectors) as well as data types typically adopted in system-level models and SW programs (i.e., integer, double and string). A ranking function is also defined to classify the mined properties according to their capability of capturing meaningful design behaviours. 
Experimental results show that the approach generates compact properties that are effective at detecting errors in the design implementation.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132746006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Instruction set extensions for Dynamic Time Warping","authors":"Joseph Tarango, Eamonn J. Keogh, P. Brisk","doi":"10.1109/CODES-ISSS.2013.6659005","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659005","url":null,"abstract":"Processor specialization through application-specific instruction set customization can significantly improve performance while reducing energy. Due to the costs associated with semiconductor fabrication, specialized processors are only viable for products with high production volumes. The emergence of low-cost sensor-based computing products in recent years has created an urgent need to process time-series data with the utmost efficiency. Although most sensor data is fixed-point, the normalization process, an absolute necessity for highly accurate similarity search of time-series data, converts the data to floating-point in order to avoid a loss in precision. The sensors that collect time-series data are typically connected to low-power microcontrollers or RISC processors without floating-point units. The computational requirements of real-time similarity search would overwhelm such processors. 
To address this concern, we introduce a specialized instruction set for time-series data mining applications to a 32-bit embedded processor, yielding a 4.87x performance improvement and a 78% reduction in energy consumption compared to a highly optimized software implementation.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115244615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving polyhedral code generation for high-level synthesis","authors":"Wei Zuo, Peng Li, Deming Chen, L. Pouchet, Shun'an Zhong, J. Cong","doi":"10.1109/CODES-ISSS.2013.6659002","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659002","url":null,"abstract":"High-level synthesis (HLS) tools are now capable of generating high-quality RTL codes for a number of programs. Nevertheless, for best performance aggressive program transformations are still required to exploit data reuse and enable communication/computation overlap. The polyhedral compilation framework has shown great promise in this area with the development of HLS-specific polyhedral transformation techniques and tools. However, all these techniques rely on polyhedral code generation to translate a schedule for the program's operations into an actual C code that is input to the HLS tool. In this work we study the changes to the state-of-the-art polyhedral code generator CLooG which are required to tailor it for HLS purposes. In particular, we develop various techniques to significantly improve resource utilization on the FPGA. We also develop a complete technique geared towards effective code generation of rectangularly tiled code, leading to further improvements in resource utilization. 
We demonstrate our techniques on a collection of affine benchmarks, reducing by 2x on average (up to 10x) the area used after high-level synthesis.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125964801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A reconfigurable real-time SDRAM controller for mixed time-criticality systems","authors":"Sven Goossens, Jasper Kuijsten, B. Akesson, K. Goossens","doi":"10.5555/2555692.2555694","DOIUrl":"https://doi.org/10.5555/2555692.2555694","url":null,"abstract":"Verifying real-time requirements of applications is increasingly complex on modern Systems-on-Chips (SoCs). More applications are integrated into one system due to power, area and cost constraints. Resource sharing makes their timing behavior interdependent, and as a result the verification complexity increases exponentially with the number of applications. Predictable and composable virtual platforms solve this problem by enabling verification in isolation, but designing SoC resources suitable to host such platforms is challenging. This paper focuses on a reconfigurable SDRAM controller for predictable and composable virtual platforms. The main contributions are: 1) A run-time reconfigurable SDRAM controller architecture, which allows trade-offs between guaranteed bandwidth, response time and power. 2) A methodology for offering composable service to memory clients, by means of composable memory patterns. 3) A reconfigurable Time-Division Multiplexing (TDM) arbiter and an associated reconfiguration protocol. The TDM slot allocations can be changed at run time, while the predictable and composable performance guarantees offered to active memory clients are unaffected by the reconfiguration. 
The SDRAM controller has been implemented as a TLM-level SystemC model, and in synthesizable VHDL for use on an FPGA.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128071478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A DRAM-flash index for native flash file systems","authors":"Chien-Chung Ho, Po-Chun Huang, Yuan-Hao Chang, Tei-Wei Kuo","doi":"10.1109/CODES-ISSS.2013.6658990","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6658990","url":null,"abstract":"Index structures are widely used in file systems and database applications for efficient data management. This paper exploits the respective characteristics of DRAM and flash memory for tree index designs, for which a native file system is taken as an example target in the research. Different from DRAM caching or buffering of flash-memory access in past work, a hybrid index design that resides over DRAM and flash memory simultaneously is proposed to improve system performance and space management. Tree nodes migrate between DRAM and flash memory, as needed, in response to user access patterns so as to optimize the performance and to reduce management overhead. The capability of the proposed design is evaluated by a series of experiments, for which we have very encouraging results.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"515 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123074510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A cyber-physical system approach to artificial pancreas design","authors":"Mahboobeh Ghorbani, P. Bogdan","doi":"10.1109/CODES-ISSS.2013.6659004","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659004","url":null,"abstract":"Healthcare costs in the US are among the highest in the world. Widespread chronic diseases such as diabetes constitute a significant cause of rising healthcare costs. Despite the increased need for smart healthcare systems that monitor patients' body balance, there is no coherent theory that facilitates the design and optimization of efficient and robust cyber physical systems. In this paper, we propose a mathematical model for capturing the dynamics of blood glucose characteristics (e.g., time dependent fractal behavior) observed in real world measurements via fractional calculus concepts. Building on our time dependent fractal model, we propose a novel mathematical model as well as hardware architecture for an artificial pancreas that relies on solving a constrained multi-fractal optimal control problem for regulating insulin injection. We verify the accuracy of our mathematical model by comparing it to conventional nonfractal models using real world measurements and showing that the nonlinear optimal controller based on fractal calculus concepts is superior to nonfractal controllers. 
We also verified the feasibility of in silico realization of the proposed optimal control algorithm by prototyping on FPGA platform.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122536460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated, retargetable back-annotation for host compiled performance and power modeling","authors":"S. Chakravarty, Zhuoran Zhao, A. Gerstlauer","doi":"10.1109/CODES-ISSS.2013.6659023","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659023","url":null,"abstract":"With traditional cycle-accurate or instruction-set simulations of processors often being too slow, host-compiled or source-level software execution approaches have recently become popular. Such high-level simulations can achieve order of magnitude speedups, but approaches that can achieve highly accurate characterization of both power and performance metrics are lacking. In this paper, we propose a novel host-compiled simulation approach that provides close to cycle-accurate estimation of energy and timing metrics in a retargetable manner, using flexible, architecture description language (ADL) based reference models. Our automated flow considers typical front- and back-end optimizations by working at the compiler-generated intermediate representation (IR). Path-dependent execution effects are accurately captured through pairwise characterization and backannotation of basic code blocks with all possible predecessors. 
Results from applying our approach to PowerPC targets running various benchmark suites show that close to native average speeds of 2000 MIPS at more than 98% timing and energy accuracy can be achieved.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121015144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing a residential hybrid electrical energy storage system based on the energy buffering strategy","authors":"Di Zhu, Siyu Yue, Yanzhi Wang, Younghyun Kim, N. Chang, Massoud Pedram","doi":"10.1109/CODES-ISSS.2013.6659019","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659019","url":null,"abstract":"Due to severe variation in load demand over time, utility companies generally raise electrical energy price during periods of high load demand. A grid-connected hybrid electrical energy storage (HEES) system can help residential users lower their electric bills by storing energy during low-price hours and releasing the stored energy during high-price hours. A HEES system consists of different types of electrical energy storage (EES) elements, utilizing the benefits of each type while hiding their weaknesses. This paper presents a residential energy management system to maximize the annual profits on residential electric bills, based on a HEES system comprised of a lead-acid battery bank as the main storage bank and a Li-ion battery bank as the energy buffer. We first derive the optimal daily energy management policy based on energy buffering to minimize the daily energy cost. Next, we find the near-optimal design specifications of the energy management system, aiming at maximizing the amortized annual profits under practical constraints. 
We show that this system achieves on average 11.10% more profit than the non-buffering HEES system.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127245879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ARGO: Aging-aware GPGPU register file allocation","authors":"Majid Namaki-Shoushtari, Abbas Rahimi, N. Dutt, Puneet Gupta, Rajesh K. Gupta","doi":"10.1109/CODES-ISSS.2013.6659017","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659017","url":null,"abstract":"State-of-the-art general-purpose graphics processing units (GPGPUs) implemented in nanoscale CMOS technologies offer very high computational throughput for highly parallel applications using hundreds of integrated on-chip resources. These resources are stressed during application execution, subjecting them to degradation mechanisms such as negative bias temperature instability (NBTI) that adversely affect their reliability. To support highly parallel execution, GPGPUs contain large register files (RFs) that are among the most highly stressed GPGPU components; however, we observe heavy underutilization of RFs (on average only 46%) for typical general-purpose kernels. We present ARGO, an Aging-awaRe GPGPU RF allOcator that opportunistically exploits this RF underutilization by distributing the stress throughout the RF. ARGO achieves proper leveling of RF banks through deliberate power-gating of stressed banks. We demonstrate our technique on the AMD Evergreen GPGPU architecture and show that ARGO reduces the NBTI-induced threshold voltage degradation by up to 43% (on average 27%), which improves the RF static noise margin by up to 46% (on average 30%). 
Furthermore, we estimate a simultaneous reduction in leakage power of 54% by providing sleep states for unused banks.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115013152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IVaM: Implicit variant modeling and management for automotive embedded systems","authors":"Sebastian Graf, M. Glaß, Dominic Wintermann, J. Teich, C. Lauer","doi":"10.1109/CODES-ISSS.2013.6659011","DOIUrl":"https://doi.org/10.1109/CODES-ISSS.2013.6659011","url":null,"abstract":"In this paper, we propose a graph-based approach for the modeling and efficient analysis of functional variants of a car's electric and electronic (E/E) architecture functionality by combining local technical expert knowledge with global business knowledge. Starting with a variants system specification including a set of task graphs, linear constraints on binary variables are specified for their alternative selection as well as the selection of groups of alternatives called application groups. These constraints may stem from domain-specific knowledge, e.g., the entertainment or power train domain, or from global constraints. The typically vast space of possible combinations of different selections of alternative behaviors is termed the variant space, and the combinations satisfying the set of formulated constraints are termed valid variants. An important result of this paper is that the set of variants, and especially the set of valid variants, need not be modeled or stored explicitly but can instead be represented implicitly. Nevertheless, we show that, using state-of-the-art pseudo-Boolean (PB) solver techniques, the set of valid variants can be determined very efficiently. Each of these valid variants may subsequently be used as a candidate for design space exploration (DSE) in order to also optimize the mapping of the corresponding task graph functionalities to a final optimized E/E architecture. 
A real-world case study is provided to demonstrate the capabilities and efficiency of the presented approach on implicit variant modeling and analysis.","PeriodicalId":163484,"journal":{"name":"2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124270529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}