J. Cortadella, A. Kondratyev, L. Lavagno, K. Lwin, C. Sotiriou
{"title":"From synchronous to asynchronous: an automatic approach","authors":"J. Cortadella, A. Kondratyev, L. Lavagno, K. Lwin, C. Sotiriou","doi":"10.1109/DATE.2004.1269092","DOIUrl":"https://doi.org/10.1109/DATE.2004.1269092","url":null,"abstract":"This paper presents a methodology to derive asynchronous circuits from optimized synchronous circuits by replacing the clock distribution tree by a handshaking network. A case study shows the applicability of the method and the potential benefits of de-synchronizing synchronous circuits.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126624747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Ruiz-Amaya, J. M. de la Rosa, F. Medeiro, F. Fernández, R. del Río, B. Pérez-Verdú, Á. Rodríguez-Vázquez
{"title":"MATLAB/SIMULINK-based high-level synthesis of discrete-time and continuous-time /spl Sigma//spl Delta/ modulators","authors":"J. Ruiz-Amaya, J. M. de la Rosa, F. Medeiro, F. Fernández, R. del Río, B. Pérez-Verdú, Á. Rodríguez-Vázquez","doi":"10.1109/DATE.2004.1269222","DOIUrl":"https://doi.org/10.1109/DATE.2004.1269222","url":null,"abstract":"This paper describes a tool that combines an accurate SIMULINK-based time-domain behavioural simulator with a statistical optimizer for the automated high-level synthesis of /spl Sigma//spl Delta/ modulators (/spl Sigma//spl Delta/Ms). The combination of high accuracy, short CPU time and interoperability of different circuit models together with the efficiency of the optimization engine makes the proposed tool an advantageous alternative for /spl Sigma//spl Delta/M synthesis. The implementation on the well-known MATLAB/SIMULINK platform brings numerous advantages in terms of data manipulation, flexibility and simulation with other electronic subsystems. Moreover, this is the first tool dealing with the synthesis of /spl Sigma//spl Delta/Ms using both discrete-time (DT) and continuous-time (CT) circuit techniques.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114954109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A low cost individual-well adaptive body bias (IWABB) scheme for leakage power reduction and performance enhancement in the presence of intra-die variations","authors":"Tom W. Chen, Justin Gregg","doi":"10.1109/DATE.2004.1268855","DOIUrl":"https://doi.org/10.1109/DATE.2004.1268855","url":null,"abstract":"This paper presents a new method of adapting body biasing on a chip during post-fabrication testing in order to mitigate the effects of process variations. Individual well biasing voltages can be changed to be connected either to a chip wide well bias or to a different bias voltage through a self-regulating mechanism, allowing biasing voltage adjustments on a per well basis. The scheme requires only one bias voltage distribution network, but allows for back biasing adjustments to more effectively mitigate die-to-die and within-die process variations. The biasing setting for each well is determined using a modified genetic algorithm. Our experimental results show that binning yields as low as 17% can be improved to greater than 90% after using the proposed IWABB method.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116059352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simultaneous state, Vt and Tox assignment for total standby power minimization","authors":"Dongwook Lee, H. Singh, D. Blaauw, D. Sylvester","doi":"10.1109/DATE.2004.1268894","DOIUrl":"https://doi.org/10.1109/DATE.2004.1268894","url":null,"abstract":"Standby leakage current minimization is a pressing concern for mobile applications that rely on standby modes to extend battery life. Also, gate oxide leakage current (I/sub gate/) has become comparable to subthreshold leakage (I/sub sub/) in 90 nm technologies. In this paper, we propose a new method that uses a combined approach of sleep-state, threshold voltage (V/sub t/ and gate oxide thickness (T/sub ox/) assignments in a dual-V/sub t/ and dual-T/sub ox/ process to minimize both I/sub sub/ and I/sub gate/. Using this method, total leakage current can be dramatically reduced since in a known state in standby mode, only certain transistors are responsible for leakage current and need to be considered for high-V/sub t/ or thick-T/sub ox/ assignment. We formulate the optimization problem for simultaneous state, V/sub t/ and T/sub ox/ assignments under delay constraints and propose two practical heuristics. We implemented and tested the proposed methods on a set of synthesized benchmark circuits. Results show an average leakage current reduction of 5a-6X and 2-3X compared to previous approaches that only use state or state+V/sub t/ assignment, respectively, with small delay penalties.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116523619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Yoo, M. Youssef, A. Bouchhima, A. Jerraya, M. Diaz-Nava
{"title":"Multi-processor SoC design methodology using a concept of two-layer hardware-dependent software","authors":"S. Yoo, M. Youssef, A. Bouchhima, A. Jerraya, M. Diaz-Nava","doi":"10.1109/DATE.2004.1269098","DOIUrl":"https://doi.org/10.1109/DATE.2004.1269098","url":null,"abstract":"In conventional multiprocessor SoC (MPSoC) design methods, we find two problems: lack of SW code portability and lack of early SW validation. The problems cause a long design cycle. To resolve them, we present a concept of two-layer hardware-dependent software (HdS). The presented HdS consists of hardware abstraction layer to abstract the sub-system architecture and SoC abstraction layer to abstract the global MPSoC architecture. During the exploration of global and sub-system architectures, the application programming interfaces of presented two-layer HdS allow to keep the SW independent from architectural change. The simulation models of two-layer HdS enable to validate the entire system including the SW and HW design early in the design steps. We show the effectiveness of the presented methodology in the MPSoC architecture exploration of an OpenDiVX encoder system design.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"345 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122297946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low static-power frequent-value data caches","authors":"Chuanjun Zhang, Jun Yang, F. Vahid","doi":"10.1109/DATE.2004.1268851","DOIUrl":"https://doi.org/10.1109/DATE.2004.1268851","url":null,"abstract":"Static energy dissipation in cache memories will constitute an increasingly larger portion of total microprocessor energy dissipation due to nanoscale technology characteristics and the large size of on-chip caches. We propose to reduce the static energy dissipation of an on-chip data cache by taking advantage of the frequent values (FV) that widely exist in a data cache memory. The original FV-based low-power cache design aimed at only reducing dynamic power, at the cost of a 5% slowdown. We propose a better design that reduces both static and dynamic cache power, and that uses a circuit design that eliminates performance overhead. A designer can utilize our architecture by simulating an application and then synthesizing the FVs into an application-specific cache design when values will not change, or by simulating and then writing to an FV-cache with configuration registers when values could change. Furthermore, we describe hardware that can dynamically determine FVs and write to the configuration registers completely transparently. Experiments on 11 Spec 2000 benchmarks show that, in addition to the dynamic power savings, 33% static energy savings for data caches can be achieved.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122427259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthesis of partitioned shared memory architectures for energy-sufficient multi-processor SoC","authors":"Kimish Patel, E. Macii, M. Poncino","doi":"10.1109/DATE.2004.1268937","DOIUrl":"https://doi.org/10.1109/DATE.2004.1268937","url":null,"abstract":"Accesses to the shared memory in multi-processor systems-on-chip represent a significant performance bottleneck. Multi-port memories are a common solution to this problem, because they allow parallel accesses. However, they are not an energy-efficient solution. We propose an energy-efficient shared-memory architecture that can be used as a substitute for multi-port memories, which is based on an application-driven partitioning of the shared address space into a multi-bank architecture. Experiments on a set of parallel benchmarks show energy savings of about 56% with respect to a dual-port memory architecture, at a very limited performance penalty.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129596599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breaking instance-independent symmetries in exact graph coloring","authors":"Arathi Ramani, F. Aloul, I. Markov, K. Sakallah","doi":"10.1613/jair.1637","DOIUrl":"https://doi.org/10.1613/jair.1637","url":null,"abstract":"Code optimization and high level synthesis can be posed as constraint satisfaction and optimization problems, such as graph coloring used in register allocation. Naturally-occurring instances of such problems are often small and can be solved optimally. A recent wave of improvements in algorithms for Boolean satisfiability (SAT) and 0-1 ILP suggests generic problem-reduction methods, rather than problem-specific heuristics, because: (1) heuristics are easily upset by new constraints; (2) heuristics tend to ignore structure; and (3) many relevant problems are provably inapproximable. The NP-spec project offers a language to specify NP-problems and automatic reductions to SAT. Problem reductions often lead to highly symmetric SAT instances, and symmetries are known to slow down SAT solvers. In this work, we compare several avenues for symmetry-breaking, in particular when certain kinds of symmetry are present in all generated instances. Our surprising conclusion is that instance-independent symmetries should often be processed together with instance-specific symmetries rather than earlier, at the specification level.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129773399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Workload characterization model for tasks with variable execution demand","authors":"A. Maxiaguine, S. Künzli, L. Thiele","doi":"10.1109/DATE.2004.1269030","DOIUrl":"https://doi.org/10.1109/DATE.2004.1269030","url":null,"abstract":"The analysis of real-time properties of an embedded system usually relies on the worst-case execution times (WCET) of the tasks to be executed. In contrast to that, in real world applications the running time of tasks may vary from execution to execution, e.g. in multimedia applications. The traditional worst-case analysis of the system then returns overly pessimistic estimates of the system performance. In this paper we propose a new effective method to characterize tasks with variable execution requirements, which leads to tighter worst-case bounds on system performance and better use of available resources. We show the applicability of our approach by a detailed study of a multimedia application.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130348010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Circularscan: a scan architecture for test cost reduction","authors":"B. Arslan, A. Orailoglu","doi":"10.1109/DATE.2004.1269073","DOIUrl":"https://doi.org/10.1109/DATE.2004.1269073","url":null,"abstract":"Scan-based designs are widely used to decrease the complexity of the test generation process; nonetheless, they increase test time and volume. A new scan architecture is proposed to reduce test time and volume while retaining the original scan input count. The proposed architecture allows the use of the captured response as a template for the next pattern with only the necessary bits of the captured response being updated while observing the full captured response. The theoretical and experimental analysis promises a substantial reduction in test cost for large circuits.","PeriodicalId":335658,"journal":{"name":"Proceedings Design, Automation and Test in Europe Conference and Exhibition","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126924633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}