{"title":"All Digital Duty Cycle Correction Circuit in 90nm Based on Mutex","authors":"S. Ramasahayam, M. Srinivas","doi":"10.1109/ISVLSI.2009.41","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.41","url":null,"abstract":"A duty cycle correction circuit (DCC) for high frequency clocks with fine resolution is designed and tested at 1.2V in 90nm CMOS process. Spice simulations show that this duty cycle corrector can adjust the output duty cycle to 50±0.5% with input clock at 500MHz and input duty cycle ranging from20% to 80%. DCC will not introduce any delay in the forward path, which makes it suitable for multi-phase clock applications. The proposed implementation uses the high frequency delay line and MUTEX (Mutual Exclusion Element) based circuit for achieving high resolution.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"54 7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128078197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Reconfiguration of Two-Level Caches in Soft Real-Time Embedded Systems","authors":"Weixun Wang, P. Mishra","doi":"10.1109/ISVLSI.2009.22","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.22","url":null,"abstract":"Cache reconfiguration is a promising optimization technique for reducing memory hierarchy energy consumption with little or no impact on overall system performance. While cache reconfiguration is successful in desktop-based systems, it is not directly applicable in real-time systems due to timing constraints. Existing scheduling-aware cache reconfiguration techniques consider only one-level cache. It is a major challenge to dynamically tune multi-level caches since the exploration space is prohibitively large. This paper efficiently integrates cache reconfiguration in soft real-time systems with a unified two-level cache hierarchy. We utilize a set of exploration heuristics during our static analysis which effectively decreases the exploration time while keeps the generated profile results beneficial to be leveraged during runtime. Our experimental results have demonstrated 32 - 49% energy savings with minor impact on performance.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"59 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131579629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inducing Thermal-Awareness in Multicore Systems Using Networks-on-Chip","authors":"David Atienza Alonso, E. Martinez","doi":"10.1109/ISVLSI.2009.25","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.25","url":null,"abstract":"Technology scaling imposes an ever increasing temperature stress on digital circuit design due to transistor density, especially on highly integrated systems, such as Multi-Processor Systems-on-Chip (MPSoCs). Therefore,temperature-aware design is mandatory and should be performed at the early design stages. In this paper we present a novel hardware infrastructure to provide thermal control of MPSoC architectures, which is based on exploiting the No interconnects of the baseline system as an active component to communicate and coordinate between temperature sensors scattered around the chip, in order to globally monitor the actual temperature. Then, a thermal management unit and clock frequency controllers adjust the frequency and voltage of the processing elements according to the temperature requirements at run-time. We show experimental results of the infrastructure to implement effective global temperature control policies for a real-life 4-core MPSoC,emulated on an FPGA-based emulation framework.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"30 21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123703470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Speed Parallel Architecture for Cyclic Convolution Based on FNT","authors":"Jian Zhang, Shuguo Li","doi":"10.1109/ISVLSI.2009.10","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.10","url":null,"abstract":"This paper presents a high speed parallel architecture for cyclic convolution based on Fermat Number Transform (FNT) in the diminished-1 number system. A code conversion method without addition (CCWA) and a butterfly operation method without addition (BOWA) are proposed to perform the FNT and its inverse (IFNT) except their final stages in the convolution. The pointwise multiplication in the convolution is accomplished by modulo 2n+1 partial product multipliers (MPPM) and output partial products which are inputs to the IFNT. Thus modulo 2n+1 carry propagation additions are avoided in the FNT and the IFNT except their final stages and the modulo 2n+1 multiplier. The execution delay of the parallel architecture is reduced evidently due to the decrease of modulo 2n+1 carry-propagation addition. Compared with the existing cyclic convolution architecture, the proposed one has better throughput performance and involves less hardware complexity. Synthesis results using 130nm CMOS technology demonstrate the superiority of the proposed architecture over the reported solution.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121190033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synchronization-Based Abstraction Refinement for Modular Verification of Asynchronous Designs","authors":"Hao Zheng, Haiqiong Yao, T. Yoneda","doi":"10.1109/ISVLSI.2009.16","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.16","url":null,"abstract":"This paper presents a modular verification approach for asynchronous circuits to address state explosion with a novel interface refinement method to reduce false counterexamples.This method borrows the idea of parallel composition,and it iteratively refines each component in a design by examining its interface interactions, and removes the behavior not synchronized with its neighbors. This method is further enhanced by synchronizing multiple components simultaneously so that inter-dependencies among components are considered. The experiments on several large asynchronous circuits show that this method efficiently removes impossible behavior from each component including ones violating correctness requirements.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115248202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-the-Fly Evaluation of FPGA-Based True Random Number Generator","authors":"R. Santoro, O. Sentieys, S. Roy","doi":"10.1109/ISVLSI.2009.33","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.33","url":null,"abstract":"Many embedded security chips require a high-quality digital True Random Number Generator (TRNG). Recently, some new TRNGs have been proposed in the literature, innovating by their new architectures. Moreover, some of them don't need to use the post-processing unit usually required in TRNG constructions. As a result, the TRNG data rate is enhanced and the produced random bits only depend on the noise source and its sampling. However, selecting a TRNG can be a delicate problem. In a hardware context (e.g. Field-Programmable Gate Array (FPGA) or Application-Specific Integrated Circuit (ASIC) implementation), the design area and power consumption are important criterions. To the best of our knowledge, no effective comparison of several TRNGs appears in the literature. This paper evaluates the randomness behavior, the area and the power consumption of the latest TRNGs. These investigations are realized into real conditions, by implementing the TRNGs into FPGA circuits.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133360546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhimin Chen, Raghunandan Nagesh, A. Reddy, P. Schaumont
{"title":"Increasing the Sensitivity of On-Chip Digital Thermal Sensors with Pre-Filtering","authors":"Zhimin Chen, Raghunandan Nagesh, A. Reddy, P. Schaumont","doi":"10.1109/ISVLSI.2009.31","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.31","url":null,"abstract":"Thermal monitoring has been broadly used to protect high-end integrated circuits from over-heating and to identify hot-spots in complex circuits. In this paper, we present a method to increase the sensitivity of an on-chip digital thermal sensor. In contrast to the existing mechanisms that characterize the overall temperature profile on a die, our solution is able to detect the submerged thermal variation caused by specific predefined events (SPE), under the precondition that the SPE’s dominant frequency does not overlap with those of other thermal events. This is made possible by pre-filtering of the temperature value. A demonstrator is implemented in an ordinary FPGA, in which the SPE is a person’s finger touching on the FPGA package. We successfully show that our design can do a correct and reliable detection of the finger touching event while ignoring other larger variations caused by other reasons. Because the finger touching event has no other special characteristics except for its unique frequency, we conclude that our solution is also applicable to other SPEs, especially low-frequency ones. In general, our method is sensitive, reliable and also flexible.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132563748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexandre K. I. Mendonça, D. Volpato, José Luís Almada Güntzel, L. Santos
{"title":"Mapping Data and Code into Scratchpads from Relocatable Binaries","authors":"Alexandre K. I. Mendonça, D. Volpato, José Luís Almada Güntzel, L. Santos","doi":"10.1109/ISVLSI.2009.28","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.28","url":null,"abstract":"Scratchpad memories (SPMs) are promising for energy-efficient embedded systems. Most optimizing techniques for mapping data and code elements to SPMs assume the availability of source code. However, embedded software development has to cope with legacy code, third-party software, and IP-protected applications for which only the binaries are available. The few techniques that directly handle binaries operate on executable files and are limited to either code or data. This work proposes a new technique that addresses both data and code allocation into SPMs. Since it operates directly on binaries, the technique allows library elements to be eligible for SPM mapping. It consists of three main engines: a profiler, a mapper and a patcher. The patcher was designed to operate upon relocatable object binaries so as to overcome the inefficiency of bookkeeping SPM relocations on executable binaries. As compared to code-only SPM mapping, an average energy saving of 15% was obtained for a varied set of benchmark programs and memory configurations. Savings around 47% were reached for the two programs with higher static data content. The average patching time was 0.23s on a quad-core workstation.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123484776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Speed Low-Current Duobinary Signaling Over Active Terminated Chip-to-Chip Interconnect","authors":"V. Pasupureddi, P. Mandal, Sunil Sachdev","doi":"10.1109/ISVLSI.2009.9","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.9","url":null,"abstract":"In this work we propose high-speed low-current duobinary signaling scheme over an active terminated chip-to-chip interconnect. The active termination scheme eliminates the need of any dedicated passive terminator both at the transmitter and receiver, avoiding signal reflection. Elimination of the passive terminator helps to reduce the transmitted signal level without effecting signal detect-ability of the receiver and also removes the thermal noise of the terminator. To implement bandwidth efficient duobinary signaling, we present a current-mode high-speed precoder operating at 10-Gb/s. A low-current active terminated driver based on modified Cherry-Hooper topology is proposed. At the receive-end, we propose an active terminated current-mode receiver(Rx) with regulated gate cascode (RGC) based transimpedance amplifier(TIA). Folded active inductor peaking is used to enhance the bandwidth of this TIA. We also propose lowpower broadband equalizer topology for channel equalization. The duobinary transmitter and receiver circuits are implemented in 1.8-V, 0.18-μm Digital CMOS technology with an f_T of 27-GHz. The designed high speed duobinary Tx/Rx circuits work up-to 8-Gb/s speed while transmitting the data over FR4 PCB trace of length 29.5-inch and for the targeted bit-error-rate(BER) of 10^−12. The power consumed in the transmitter and receiver circuits is 42.9-mW at 8-Gb/s","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126483913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An 8-bit 1.8 V 500 MSPS CMOS Segmented Current Steering DAC","authors":"Santanu Sarkar, S. Banerjee","doi":"10.1109/ISVLSI.2009.12","DOIUrl":"https://doi.org/10.1109/ISVLSI.2009.12","url":null,"abstract":"This paper presents design of an 8-bit 1.8 V segmented current steering (CS) digital-to-analog converter (DAC)using 0.18 μm double poly five metal CMOS technology. The DAC has been segmented as 6+2 to achieve optimum performance for minimum area. The simulation result shows a maximum DNLof 0.30 LSB and an INL of 0.33 LSB. The midcode glitch is0.27 pV s. The simulated SNDR and SFDR of the segmented DAC are 52.13 dB and 44.83 dB respectively. The settling of the segmented DAC is 6.02 ns. The power consumption is simulated as 7.88 mW. The prototype will be used in telecommunication applications.","PeriodicalId":137508,"journal":{"name":"2009 IEEE Computer Society Annual Symposium on VLSI","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128182074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}