{"title":"On-chip clock error characterization for clock distribution system","authors":"Chuan Shan, D. Galayko, F. Anceau","doi":"10.1109/ISVLSI.2013.6654630","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654630","url":null,"abstract":"In this paper, we investigate a test strategy for characterization of clock error statistics between two clock domains in high-speed clocking systems (gigahertz and more). The method allows an indirect measurement (not based on time interval measurement) of clock error distribution by observing the integrity of a periodic sequence transmitted between two clocking domains. The method is compatible with fully on-chip implementation, and the readout of result to off-chip signals is cadenced at low rate. The strategy aims at picoseconds resolution without complex calibration. The idea was first validated by a discrete prototype at downscaled frequencies, and then a high frequency on-chip prototype was designed using 65 nm CMOS technology. Simulation results predict a measurement precision of less than ±2.5 ps. The article presents the theory, exposes the hardware implementation, and reports the experimental validation and simulation results of two prototypes.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128267458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel tool flow for increased routing configuration similarity in multi-mode circuits","authors":"Brahim Al Farisi, Elias Vansteenkiste, Karel Bruneel, D. Stroobandt","doi":"10.1109/ISVLSI.2013.6654629","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654629","url":null,"abstract":"A multi-mode circuit implements the functionality of a limited number of circuits, called modes, of which at any given time only one needs to be realised. Using run-time reconfiguration (RTR) of an FPGA, all the modes can be time-multiplexed on the same reconfigurable region, requiring only an area that can contain the biggest mode. Typically, conventional run-time reconfiguration techniques generate a configuration of the reconfigurable region for every mode separately. This results in configurations that are bit-wise very different. Thus, in this case, many bits need to be changed in the configuration memory to switch between modes, leading to long reconfiguration times. In this paper we present a novel tool flow that retains the placement of the conventional RTR flow, but uses TRoute, a reconfiguration-aware connection router, to implement the connections of all modes simultaneously. TRoute stimulates the sharing of routing resources between connections of different modes. This results in a significant increase in the similarity between the routing configurations of the modes. 
In the experimental results it is shown that the number of routing configuration bits that needs to be rewritten is reduced with a factor between 2 and 4 compared to conventional techniques.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124542291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recent advances and challenges in physical design automation","authors":"M. Johann","doi":"10.1109/ISVLSI.2013.6654613","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654613","url":null,"abstract":"Summary form only given. A key factor that enabled the development of the microelectronics industry was the creation of sophisticated software tools for design automation. The tremendous evolution of manufacturing capacity was primarily fuelled by scaling down the transistors. Thus, besides being possible to integrate an increasing number of devices, they have also become increasingly faster. However, this wonderful manufacturing capacity would be of little use if it were not possible for human designers to specify the different functions, structures and physical characteristics of highly complex projects in a progressively more efficient way. That is why the most traditional design problems such as placement, routing, gate sizing, are still hot topics today. In the last years, all these subjects regained a lot of attention and experimented significant, sometimes radical, advancements. In this talk we review basic concepts at the definition of such problems, which in general require combinatory optimization, and cover some of the most important achievements in placement and routing in the last decade. They were made possible by the combined effort of industry and academia with the release of realistic benchmark sets and the promotion of several research contests. 
Finally, we can better understand some of the current challenges that must be faced to keep pace with the next big designs, highlighting the relevance of research in this area.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"11 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132939391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed resource management in NoC-based MPSoCs with dynamic cluster sizes","authors":"Guilherme M. Castilhos, Marcelo G. Mandelli, G. Madalozzo, F. Moraes","doi":"10.1109/ISVLSI.2013.6654651","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654651","url":null,"abstract":"Scalability is an important issue in large MPSoCs. MPSoCs may execute several applications in parallel, with dynamic workload, and tight QoS constraints. Thus, the MPSoC management must be distributed to cope with such constraints. This paper presents a distributed resource management in NoC-Based MPSoC, using a clustering method, enabling the modification of the cluster size at runtime. This work addresses the following distributed techniques: task mapping, monitoring and task migration. Results show an important reduction in the total execution time of applications, reduced number of hops between tasks (smaller communication energy), and a reclustering method through monitoring and task migration.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133070875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neutron-induced single event effects analysis in a SAR-ADC architecture embedded in a mixed-signal SoC","authors":"L. Tambara, F. Kastensmidt, P. Rech, T. Balen, M. Lubaszewski","doi":"10.1109/ISVLSI.2013.6654657","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654657","url":null,"abstract":"This paper describes a neutron-induced single event effect test in analog-to-digital converters of a Microsemi's programmable commercial mixed-signal system-on-chip. The main objective is to investigate the reliability of the charge redistribution successive approximation register architecture of the analog-to-digital converters (SAR-ADC) embedded into this device, considering critical application projects. The case-study circuit is a data acquisition system that uses the two available analog-to-digital converters (ADCs), being one converter controlled by the embedded processor and the other by the digital programmable matrix of the device. This scheme is based on a design diversity redundancy concept. The setup was exposed to a neutron source at the CCLRC Rutherford Appleton Laboratory - ISIS in order to investigate the occurrence of SEEs ranging from single to errors bursts. Also, SPICE simulations were carried out in a charge redistribution SAR-ADC architecture in order to clarify the results obtained from this experiment.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114831377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of standard-cell libraries for asynchronous circuits with the ASCEnD flow","authors":"Matheus T. Moreira, Ney Laert Vilar Calazans","doi":"10.1109/ISVLSI.2013.6654647","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654647","url":null,"abstract":"This work presents the ASCEnD flow, a design flow devised for the design of components required for the design of asynchronous systems using standard-cells. The flow is fully automated except for the layout generation step, and can be parameterized for any CMOS technology. It was employed in the design of a dedicated standard-cell library, which contains over five hundred components, for the STMicroelectronics 65 nm technology. This library supported implementation of different circuits, like network-on-chip routers and cryptographic cores.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125840833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Branch-and-bound style resource constrained scheduling using efficient structure-aware pruning","authors":"Mingsong Chen, Saijie Huang, G. Pu, P. Mishra","doi":"10.1109/ISVLSI.2013.6654637","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654637","url":null,"abstract":"Branch-and-bound approaches are promising in pruning infeasible search space during the resource constrained scheduling (RCS). However, such methods only compare the estimated upper and lower bounds of an incomplete schedule to the length of the best feasible schedule at that iteration. This paper proposes an efficient pruning technique which can identify the fruitless search space based on the detailed structural scheduling information of the obtained best feasible schedule. The proactive nature of our pruning technique enables the pruning of the space which cannot be identified by the state-of-the-art branch-and-bound techniques. The experimental results demonstrate that our approach can drastically (up to two orders-of-magnitude) reduce the overall RCS time under a wide variety of resource constraints.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124756962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LImbiC: An adaptable architecture description language model for developing an application-specific image processor","authors":"Carsten Tradowsky, T. Harbaum, Shaver Deyerle, J. Becker","doi":"10.1109/ISVLSI.2013.6654619","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654619","url":null,"abstract":"Due to their ease of integration and widespread adoption, General Purpose Processors (GPP) are presently used in a wide range of applications. However, the highly flexible nature of a GPP leads to overhead in terms of power, performance and area for a specific application. Another approach, proposed by this paper, is to use Application-Specific Instruction-set Processors (ASIP) that are specifically adapted to a given application. To decrease development time and effort and consequently time-to-market, a model-based development process is used. The high-level model allows for automated generation of software development tools, simulation models and RTL models from a single source. An adaptable LISA model representing a simplified ARM Cortex-M1 processor is used as a base, which is then supplemented by application-specific features requested by the software developer or system architect. This paper presents a working example of this concept, in which a state-of-the-art processor model, we call LImbiC, is extended to meet the requirements of a specific application. Specifically, custom instructions are added to the LImbiC processor to improve its performance in the particular task of image processing. 
In addition, during the development process infrequently used or obsoleted instructions can be removed, which allows for separate versions of LImbiC to meet varying design goals within the design space exploration.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116228492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A discussion on SRAM forward/inverse problem analyses for RTN long-tail distributions","authors":"Worawit Somha, H. Yamauchi, Ma YuYu","doi":"10.1109/ISVLSI.2013.6654623","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654623","url":null,"abstract":"This paper discusses, for the first time, how the statistical SRAM design analyses should be changed when: (1) the shift-amount of the time-dependent (TD) voltage margin variations (MV) after the screening test will become larger than that before and (2) the shapes of the MV distribution will change from the Gaussian to the complex mixtures of Gamma distributions. We discuss on the SRAM TD-MV analyses with not only the forward problem but also the inverse problem, i.e., deconvolution analyses. The proposed algorithm for the deconvolution to circumvent the issues caused by high-pass filtering behavior is discussed. Based on the proposed convolution /deconvolution design analyses, it has been shown for the first time that: (1) detecting the truncating point of the distributions of TD-MV by the screening test and (2) predicting the required the MV-shift-amount by the assisted circuit schemes to avoid the out of specs in the market during the life-time, etc, has become enabled based on the target specification.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123839929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A study on polymorphing superscalar processor dynamically to improve power efficiency","authors":"S. Srinivasan, Rance Rodrigues, A. Annamalai, I. Koren, S. Kundu","doi":"10.1109/ISVLSI.2013.6654621","DOIUrl":"https://doi.org/10.1109/ISVLSI.2013.6654621","url":null,"abstract":"Asymmetric Multicore Processors (AMP) have emerged as likely candidates to solve the performance/power conundrum in the current generation of processors. Most recent work in this area evaluate such multicores by considering large (usually out-of-order (OOO)) and small (usually in-order (InO)) cores on the same chip. Dynamic online swapping of threads between these cores is then facilitated whenever deemed beneficial. However, if threads are swapped too often, the overheads may negatively impact the benefits of swapping. Hence, in most recent work, thread swapping decisions are made at coarse grain instruction granularities, leaving out many opportunities. In this paper, we propose a scheme to mitigate the penalty imposed by thread swapping and yet achieve all the benefits of AMPs. Here, a single superscalar OOO core morphs itself into an InO core at runtime, whenever determined to be performance/Watt efficient. Certain Intel processors already have a similar mechanism to statically morph an OOO core to an InO core to facilitate debug. We extend this existing capability to perform dynamic core morphing at runtime with an orthogonal objective of improving power efficiency. Results indicate that on an average, performance/Watt benefits of 10% can be extracted by our proposed morphing scheme at a very small performance penalty of 3.8%. 
Since this scheme is based on existing mechanisms readily available in current microprocessors, it incurs no hardware overheads.","PeriodicalId":439122,"journal":{"name":"2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131208116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}