{"title":"Buffer library selection","authors":"J. Neves, Stephen T. Quay","doi":"10.1109/ICCD.2000.878289","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878289","url":null,"abstract":"Buffer insertion has become a critical optimization technique in high performance design. Perhaps the most prevalent buffer insertion technique is Van Ginneken's dynamic programming algorithm. Although very effective, the algorithm has time complexity that is quadratic in terms of the input buffer library size. Consequently, to achieve an efficient algorithm, it is critical that the buffer library used by the tool be relatively small, containing a subset of the most effective buffers. We propose a new algorithm for selecting a buffer library from all the buffers available in the technology, thereby permitting efficient buffer insertion. We show that the smaller buffer libraries constructed by our algorithm result in little loss in solution quality while speeding up the buffer insertion algorithm by orders of magnitude.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125955808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A selective temporal and aggressive spatial cache system based on time interval","authors":"Jung-Hoon Lee, Jang-Soo Lee, Shin-Dug Kim","doi":"10.1109/ICCD.2000.878298","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878298","url":null,"abstract":"This paper proposes a new cache system that can increase the effect by temporal and spatial locality by using only simple hardware control without any locality detection hardware or compiler aid. The proposed cache system consists of two caches with different associativities and different block sizes, i.e., a direct-mapped cache with small block size and a fully associative spatial buffer with large block size as a multiple of small blocks. Therefore, the spatial locality can be exploited by aggressively fetching large blocks including any missed small block into the buffer, and the temporal locality can also be exploited by selectively storing small blocks that were referenced at the spatial buffer in the past. To determine the blocks to be stored at the direct-mapped cache, the proposed cache system uses a time interval-based selection mechanism. According to the simulation results, similar performance can be achieved by using four times smaller cache size compared with the conventional direct-mapped cache.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127491715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An evaluation of move-based multi-way partitioning algorithms","authors":"Elie Yarack, J. Carletta","doi":"10.1109/ICCD.2000.878309","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878309","url":null,"abstract":"This paper presents a thorough analytical and experimental comparison of four move-based multi-way partitioning algorithms. Modifications are considered to the algorithm with the best solution quality, partitioning by free moves. These modifications allow a tradeoff to be made between solution quality and execution time. Results are given for ISCAS and other benchmarks.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128426486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AMULET3: a 100 MIPS asynchronous embedded processor","authors":"S. Furber, D. A. Edwards, J. Garside","doi":"10.1109/ICCD.2000.878304","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878304","url":null,"abstract":"AMULET3 is a 32-bit asynchronous processor core that is fully instruction set compatible with the clocked ARM cores. It represents the culmination of ten years of research and development into asynchronous processor design at the University of Manchester, and is the first step into commercial use for this technology. AMULET3 shows that asynchronous technology is commercially viable, and is competitive in terms of performance, area and power-efficiency with clocked design. In addition, asynchronous design offers significant advantages in terms of reduced electromagnetic interference and unique power management capabilities.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132647604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced wiring RC timing design techniques for logic LSIs in gigahertz era and beyond","authors":"T. Hiyama, Yuko Ito, S. Isomura, Kazunobu Nojiri, Eijiro Maeda","doi":"10.1109/ICCD.2000.878340","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878340","url":null,"abstract":"In this paper, we describe an advanced wiring RC timing design techniques for the gigahertz era. Our new technique of wiring capacitance extraction makes it possible to calculate more than 1 M nets within 3 hours as accurately as carrying out net-by-net 3-D simulations. Furthermore, we introduced the timing window for estimating crosstalk effects on delay time so as to distinguish harmful nets from harmless nets and reduce surplus design guard-bands.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"248 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133658590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Static timing analysis with false paths","authors":"Haizhou Chen, B. Lu, D. Du","doi":"10.1109/ICCD.2000.878336","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878336","url":null,"abstract":"Finding the longest path and the worst delay is the most important task in static timing analysis. But in almost every digital circuit, there exists false paths which are logically impossible or designers don't care about their delays. This paper presents a new method to calculate the worst delay of a circuit with known false paths. When searching for the longest path, it stores delays on nodes conditionally with false paths matched up to the node, thus reduces the number of cache entries and eliminates revisits. This method can be applied to incremental delay calculation with little change. Experiments show that the new method is significantly better than path enumeration without conditional cache.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115165344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of shared memory misses and reference patterns","authors":"J. Rothman, A. Smith","doi":"10.1109/ICCD.2000.878285","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878285","url":null,"abstract":"Shared bus computer systems permit the relatively simple and efficient implementation of cache consistency algorithms, but the shared bus is a bottleneck which limits performance. False sharing can be an important source of unnecessary traffic for invalidation-based protocols, elimination of which can provide significant performance improvements. For many multiprocessor workloads, however, most misses are true sharing plus cold start misses. Regardless of the cause of cache misses, the largest fraction of bus traffic are words transferred between caches without being accessed, which we refer to as dead sharing. We establish here new methods for characterizing cache block reference patterns, and we measure how these patterns change with variation in workload and block size. Our results show that 42 percent of 64-byte cache blocks are invalidated before more than one word has been read from the block and that 58 percent of blocks that have been modified only have a single word modified before an invalidation to the block occurs. Approximately 50 percent of blocks written and subsequently read by other caches show no use of the newly written information before the block is again invalidated. In addition to our general analysis of reference patterns, we also present a detailed analysis of dead sharing for each shared memory multiprocessor program studied. We find that the worst 10 blocks (based on most total misses) from each of our traces contribute almost 50 percent of the false shearing misses and almost 20 percent of the true sharing misses (on average). A relatively simple restructuring of four of our workloads based on analysis of these 10 worst blocks leads to a 21 percent reduction in overall misses and a 15 percent reduction in execution time. Permitting the block size to vary (as could be accomplished with a sector cache) shows that bus traffic can be reduced by 88 percent (for 64-byte blocks) while also decreasing the miss ratio by 35 percent.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125095464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Delay constrained optimization by simultaneous fanout tree construction, buffer insertion/sizing and gate sizing","authors":"I-Min Liu, A. Aziz","doi":"10.1109/ICCD.2000.878287","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878287","url":null,"abstract":"We present a novel algorithm for delay constrained optimization of combinational logic, extending the state-of-the-art sizing algorithm based on Lagrangian relaxation. We tightly integrate fanout tree construction, buffer insertion/sizing and gate sizing, thereby achieving more optimization than if they were performed independently. We consider the network in its entirety, thereby taking full advantage of the slacks available on the noncritical paths. We have implemented our algorithm and experimented with it on ISCAS-89 benchmark circuits; the results demonstrate that it is effective as well as fast.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125818283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unified fine-granularity buffering of index and data: approach and implementation","authors":"Q. Cao, J. Torrellas, H. Jagadish","doi":"10.1109/ICCD.2000.878284","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878284","url":null,"abstract":"Disk I/O is recognized as a major performance bottleneck in many database applications. Consequently, a topic of considerable study in database systems has traditionally been buffer management. Recently, disk pages have been increasing in size, enabling more and more data to fit in a single page. Such a trend suggests that buffering the data at a grain size finer than a page may use memory better. As a result, there has been some interest in fine-granularity buffering. Past approaches to fine-granularity buffering have proposed buffering either data tuples alone or index entries alone. In this paper, we propose a scheme to support fine-granularity buffering of both index and data entries in a unified manner. The scheme, which we call Hot-Entry buffering, can be used in combination with conventional page-level buffering. Through the experimental evaluation of a simple system, we demonstrate the benefits of our scheme over conventional page-level buffering, and over index-only and data-only fine-granularity buffering. In particular, we show that, for a range of parameter values, our buffering scheme speeds-up query execution by 20-60% relative to page-level buffering only, and by 10-20% relative to the best of index-only or data-only fine-granularity buffering.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122016111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An advanced instruction folding mechanism for a stackless Java processor","authors":"A. Kim, J. M. Chang","doi":"10.1109/ICCD.2000.878343","DOIUrl":"https://doi.org/10.1109/ICCD.2000.878343","url":null,"abstract":"In order to improve the execution speed of Java in hardware, a new advanced instruction folding technique has been developed. In this paper an instruction folding scheme based on an advanced Producer, Operator and Consumer (POC) model is proposed and demonstrates improvement in bytecode execution over the existing techniques. The proposed POC model is able to detect and fold all possible instruction sequence types dynamically in hardware, including a sequence that is separated by other bytecode instructions. SPEC JMV98 benchmark results show that the proposed POC model-based folder can save more than 90% of folding operations. In this research, the proposed instruction folding technique can eliminate most of the stack operations and the use of a physical operand stack, and can thereby achieve the performance of high-end RISC processors.","PeriodicalId":437697,"journal":{"name":"Proceedings 2000 International Conference on Computer Design","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127036322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}