{"title":"On-chip multiple superscalar processors with secondary cache memories","authors":"M. Hanawa, T. Nishimukai, O. Nishii, Masatoshi Suzuki, K. Yano, M. Hiraki, S. Shukuri, T. Nishida","doi":"10.1109/ICCD.1991.139862","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139862","url":null,"abstract":"The development of an experimental high-performance microprocessor chip based on a 0.3-μm BiCMOS technology is discussed. It is designed to operate at a 250-MHz clock rate. It includes two processors, each of which executes two instructions in parallel. The chip performs 1000 MIPS when instructions and data are fetched from primary caches. It also includes a four-way interleaved secondary cache accessed in parallel according to a split-bus protocol, to reduce shared memory conflicts. The VLSI architecture and design results of this chip are described.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132773338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IBM ES/9000 system architecture and hardware","authors":"W. J. Nohilly, V. Lund","doi":"10.1109/ICCD.1991.139968","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139968","url":null,"abstract":"A description is given of how IBM's Enterprise Systems requirements for data management are implemented in the ES/9000 processor series. The ES/9000 processor family is the next step in efficient data movement/management for data that reside anywhere in the Enterprise. To manage this vast amount of data, certain high end ES/9000 models utilize four levels of memory (L1 or first level cache, L2 or second level buffer, L3 or main store and L4 or expanded store). Design trade-offs were made to improve processor availability. The processors have been designed to achieve fault tolerant function in power, processor and memory arrays. For example, if one of the power supplies fails, others increase their output to provide full power to the system. Extensive use is made of error detection/correction codes throughout the processor complex. Expanded storage incorporates a sophisticated error correction capability. All double bit errors are corrected, and triple bit errors are detected.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133431391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced chip/package design for the IBM ES/9000","authors":"R. Belanger, David P. Conrady, P. S. Honsinger, T. Lavery, S. J. Rothman, Erich C. Schanzenbach, D. Sitaram, C. Selinger, R. E. DuBois, G. W. Mahoney, G. Miceli","doi":"10.1109/ICCD.1991.139969","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139969","url":null,"abstract":"The automatic placement and wiring programs used for design of the gate array bipolar chips, the TCM logical design optimization for timing, and the automated module wiring programs of the ES/9000 machines are described. An overview of related aspects of the chip and module technologies is given.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133127844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A mechanism for efficient context switching","authors":"P. Nuth, W. Dally","doi":"10.1109/ICCD.1991.139903","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139903","url":null,"abstract":"Context switches are slow in conventional processors because the entire processor state must be saved and restored, even if much of the restored state is not used before the next context switch. This unnecessary data movement is required because of the coarse granularity of binding between names and registers. The context cache is introduced, which binds variable names to individual registers. This allows context switches to be very inexpensive, since registers are only loaded and saved out as needed. Analysis shows that the context cache holds more live data than a multithreaded register file, and supports more tasks without spilling to memory. Circuit simulations show that the access time of a context cache is 7% greater than a conventional register file of the same size.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116091490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Three-level decomposition with application to PLDs","authors":"A. A. Malik, D. Harrison, R. Brayton","doi":"10.1109/ICCD.1991.139989","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139989","url":null,"abstract":"A scheme for programmable logic array (PLA) decomposition that consists of one level of PLAs followed by a second level of simple two-input logic gates is presented. The propagation delay is therefore the sum of the delay through one level of PLA and one level of two-input gates. Since the delay through a two-input gate is significantly less than that through a PLA, the timing performance of the new scheme is generally superior to those of earlier PLA decomposition schemes. The sizes of the PLAs used depend on the choice of the two-input gates. An algorithm is presented that chooses the functionality of the gates such that the areas of the first-level PLAs are minimized, further improving performance. The new decomposition scheme was developed for the automatic programming of a programmable logic device (PLD) which had basically a three level architecture. The functional unit for such a PLD is described and the application of the algorithm to the programming of these functional units is discussed. Experimental results show that the new scheme significantly reduces the area over the single PLA implementation.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116818693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reduced Hamming count and its aliasing probability","authors":"A. Gleason, W. Jone","doi":"10.1109/ICCD.1991.139918","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139918","url":null,"abstract":"Hardware overhead reduction through counter selection is considered for the Hamming count compaction test. A method to choose the most effective syndrome and input variable counter pair is given. Both simulation and theoretical analysis illustrate that this method produces an optimal pairing. The aliasing probability of this two-counter test is developed and shown to reduce the exhaustive ones count aliasing probability by half an order.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122235021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Redundancy identification and removal based on implicit state enumeration","authors":"Hyunwoo Cho, G. Hachtel, F. Somenzi","doi":"10.1109/ICCD.1991.139849","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139849","url":null,"abstract":"The knowledge of the state transition graph (STG) of a sequential circuit helps in generating test sequences and identifying redundancies. The application of algorithms to the identification and removal of redundancies is reported. This strategy is based on traversing the STG of the given circuit and then performing redundancy identification using the reachability information calculated by the traversal. This method considers one candidate redundancy at a time, in an order that tries to minimize the total processing time. Substantial area and delay reductions are achieved. Experiments show that for many circuits 100% of the sequentially redundant faults can be eliminated in very reasonable amounts of time.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125746443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Logic synthesis of 100-percent testable logic networks","authors":"G. Tromp, A. V. Goor","doi":"10.1109/ICCD.1991.139937","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139937","url":null,"abstract":"An approach is presented for the synthesis of 100% testable logic networks based on a test pattern generation system for the identification of redundant faults. A redundancy removal procedure for the elimination of redundant nodes and gates from the network is also presented. Elimination of redundancy is an important task in a logic synthesis system that aims at the synthesis of 100% testable logic networks. Logic synthesis algorithms tend to generate a large number of redundancies, most of which can be easily identified, but some of these redundancies are very hard to identify by logic minimization procedures as well as by conventional test pattern generation algorithms.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123466531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"I/O pad assignment based on the circuit structure","authors":"Massoud Pedram, K. Chaudhary, E. Kuh","doi":"10.1109/ICCD.1991.139906","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139906","url":null,"abstract":"An algorithm is presented for assigning off-chip I/O pads for a logic circuit. The technique, which is based on the analysis of the circuit structure and path delay constraints, uses linear placement, goal-programming, linear-sum assignment and I/O pad clustering to assign locations to I/O pads. The I/O pad assignment is then used by placement tools. Experimental data show that as a result of using the I/O pad assignment procedure, the total interconnection length and circuit delay (after placement and routing) are reduced by 8-15% and 3-4%, respectively. This technique is general and can handle I/O pad assignment prior to logic synthesis or detailed placement procedures.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121972492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power-down structures for BIST","authors":"P. S. Levy","doi":"10.1109/ICCD.1991.139895","DOIUrl":"https://doi.org/10.1109/ICCD.1991.139895","url":null,"abstract":"The author discusses how power-down test structures use a separate test power supply to allow the associated test circuitry to power-down when not being tested. These structures are specially designed to self-isolate from the host circuit when power is removed from Tvdd. This action will increase reliability by removing auxiliary circuit elements used for test from the host design during normal operation and decrease power consumption.","PeriodicalId":239827,"journal":{"name":"[1991 Proceedings] IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"160 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122310225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}