Workshop on Memory System Performance and Correctness: Latest Publications

Deconstructing process isolation
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178599
Mark Aiken, Manuel Fähndrich, C. Hawblitzel, G. Hunt, J. Larus
{"title":"Deconstructing process isolation","authors":"Mark Aiken, Manuel Fähndrich, C. Hawblitzel, G. Hunt, J. Larus","doi":"10.1145/1178597.1178599","DOIUrl":"https://doi.org/10.1145/1178597.1178599","url":null,"abstract":"Most operating systems enforce process isolation through hardware protection mechanisms such as memory segmentation, page mapping, and differentiated user and kernel instructions. Singularity is a new operating system that uses software mechanisms to enforce process isolation. A software isolated process (SIP) is a process whose boundaries are established by language safety rules and enforced by static type checking. SIPs provide a low cost isolation mechanism that provides failure isolation and fast inter-process communication.To compare the performance of Singularity's SIPs against traditional isolation techniques, we implemented an optional hardware isolation mechanism. Protection domains are hardware-enforced address spaces, which can contain one or more SIPs. Domains can either run at the kernel's privilege level or be fully isolated from the kernel and run at the normal application privilege level. With protection domains, we can construct Singularity configurations that are similar to micro-kernel and monolithic kernel systems. We found that hardware-based isolation incurs non-trivial performance costs (up to 25--33%) and complicates system implementation. Software isolation has less than 5% overhead on these benchmarks.The lower run-time cost of SIPs makes their use feasible at a finer granularity than conventional processes. However, hardware isolation remains valuable as a defense-in-depth against potential failures in software isolation mechanisms. Singularity's ability to employ hardware isolation selectively enables careful balancing of the costs and benefits of each isolation technique.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121961217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 79
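To illustrate why inter-process communication between software-isolated processes can be cheap, the C sketch below models message passing within a single address space as a pointer handoff under a single-ownership convention. This is only an illustration of the idea, not Singularity code: the `message_t` type and the channel are hypothetical, and the ownership rule that Singularity enforces statically with its type system is expressed here only as comments.

```c
/* Illustrative sketch (not Singularity code): zero-copy message passing
 * between two software-isolated components sharing one address space.
 * Ownership of a message is transferred by handing over the pointer;
 * Singularity enforces such single ownership statically, here it is
 * only a convention documented in comments. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {            /* hypothetical message type */
    size_t len;
    char   payload[256];
} message_t;

#define QUEUE_CAP 16
typedef struct {
    message_t *slots[QUEUE_CAP];
    int head, tail;
} channel_t;

/* Sender gives up ownership of msg: no copy of the payload is made. */
static int channel_send(channel_t *ch, message_t *msg) {
    int next = (ch->tail + 1) % QUEUE_CAP;
    if (next == ch->head) return -1;        /* queue full */
    ch->slots[ch->tail] = msg;
    ch->tail = next;
    return 0;
}

/* Receiver takes ownership and must eventually free the message. */
static message_t *channel_recv(channel_t *ch) {
    if (ch->head == ch->tail) return NULL;  /* queue empty */
    message_t *msg = ch->slots[ch->head];
    ch->head = (ch->head + 1) % QUEUE_CAP;
    return msg;
}

int main(void) {
    channel_t ch = { .head = 0, .tail = 0 };

    message_t *m = malloc(sizeof *m);       /* "exchange heap" allocation */
    m->len = strlen("hello from SIP A");
    memcpy(m->payload, "hello from SIP A", m->len + 1);

    channel_send(&ch, m);                   /* pointer handoff, no copy */
    /* With hardware-isolated address spaces, the payload would instead be
     * copied (or pages remapped) when crossing the protection boundary. */

    message_t *r = channel_recv(&ch);
    if (r) {
        printf("received %zu bytes: %s\n", r->len, r->payload);
        free(r);                            /* receiver now owns the message */
    }
    return 0;
}
```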
Efficient pattern mining on shared memory systems: implications for chip multiprocessor architectures
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178603
G. Buehrer, Yen-kuang Chen, S. Parthasarathy, A. Nguyen, A. Ghoting, Daehyun Kim
{"title":"Efficient pattern mining on shared memory systems: implications for chip multiprocessor architectures","authors":"G. Buehrer, Yen-kuang Chen, S. Parthasarathy, A. Nguyen, A. Ghoting, Daehyun Kim","doi":"10.1145/1178597.1178603","DOIUrl":"https://doi.org/10.1145/1178597.1178603","url":null,"abstract":"Frequent pattern mining is a fundamental data mining process which has practical applications ranging from market basket data analysis to web link analysis. In this work, we show that state-of-the-art frequent pattern mining applications are inefficient when executing on a shared memory multiprocessor system, due primarily to poor utilization of the memory hierarchy. To improve the efficiency of these applications, we explore memory performance improvements, task partitioning strategies, and task queuing models designed to maximize the scalability of pattern mining on SMP systems. Empirically, we show that the proposed strategies afford significantly improved performance. We also discuss implications of this work in light of recent trends in micro-architecture design, particularly chip multiprocessors (CMPs).","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128581711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
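As background for the abstract above, the sketch below shows the core operation of frequent pattern mining in C: counting the support of candidate itemsets over a transaction database, with each transaction packed into a bitmask so a containment test is a single AND and compare. The toy database, the encoding, and the support threshold are illustrative assumptions; the paper's parallel task-partitioning and queuing strategies are not reproduced here.

```c
/* Illustrative sketch: support counting for frequent itemset mining.
 * Each transaction over at most 32 items is packed into one uint32_t,
 * so testing whether a candidate itemset is contained in a transaction
 * is a single AND and compare -- a compact, cache-friendly layout. */
#include <stdio.h>
#include <stdint.h>

/* Toy transaction database: bit i set means item i appears. */
static const uint32_t transactions[] = {
    0x7,   /* items {0,1,2} */
    0x3,   /* items {0,1}   */
    0x6,   /* items {1,2}   */
    0xB,   /* items {0,1,3} */
    0x5,   /* items {0,2}   */
};
#define NUM_TX (sizeof transactions / sizeof transactions[0])

/* Count how many transactions contain every item of the candidate set. */
static unsigned support(uint32_t itemset) {
    unsigned count = 0;
    for (size_t t = 0; t < NUM_TX; ++t)
        if ((transactions[t] & itemset) == itemset)
            ++count;
    return count;
}

int main(void) {
    const unsigned min_support = 3;               /* hypothetical threshold */
    uint32_t candidates[] = { 0x1, 0x2, 0x3, 0x6 };

    for (size_t c = 0; c < sizeof candidates / sizeof candidates[0]; ++c) {
        unsigned s = support(candidates[c]);
        printf("itemset 0x%X: support %u%s\n", (unsigned)candidates[c], s,
               s >= min_support ? " (frequent)" : "");
    }
    return 0;
}
```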
Smarter garbage collection with simplifiers
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178601
Melissa E. O'Neill
{"title":"Smarter garbage collection with simplifiers","authors":"Melissa E. O'Neill","doi":"10.1145/1178597.1178601","DOIUrl":"https://doi.org/10.1145/1178597.1178601","url":null,"abstract":"We introduce a method for providing lightweight daemons, called simplifiers, that attach themselves to program data. If a data item has a simplifier, the simplifier may be run automatically from time to time, seeking an opportunity to \"simplify\" the object in some way that improves the program's time or space performance.It is not uncommon for programs to improve their data structures as they traverse them, but these improvements must wait until such a traversal occurs. Simplifiers provide an alternative mechanism for making improvements that is not tied to the vagaries of normal control flow.Tracing garbage collectors can both support the simplifier abstraction and benefit from it. Because tracing collectors traverse program data structures, they can trigger simplifiers as part of the tracing process. (In fact, it is possible to view simplifiers as analogous to finalizers; whereas an object can have a finalizer that is run automatically when the object found to be dead, a simplifier can be run when the object is found to be live.)Simplifiers can aid efficient collection by simplifying objects before they are traced, thereby eliminating some data that would otherwise have been traced and saved by the collector. We present performance data to show that appropriately chosen simplifiers can lead to tangible space and speed benefits in practice.Different variations of simplifiers are possible, depending on the triggering mechanism and the synchronization policy. Some kinds of simplifier are already in use in mainstream systems in the form of ad-hoc garbage-collector extensions. For one kind of simplifier we include a complete and portable Java implementation that is less than thirty lines long.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126401225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
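The thirty-line Java implementation mentioned in the abstract appears in the paper itself; the C fragment below is only a sketch of the underlying idea under an assumed, simplified object layout: a per-object hook that a toy mark-phase tracer invokes when it finds an object live, used here to collapse a chain of indirection nodes before they are traced.

```c
/* Illustrative sketch: a "simplifier" hook run by a toy mark-phase tracer.
 * When the tracer finds an object live, it first gives the object a chance
 * to simplify itself; here the simplifier collapses chains of indirection
 * nodes so the dead intermediate nodes are never traced or retained. */
#include <stdio.h>
#include <stddef.h>

typedef struct obj obj;
struct obj {
    const char *name;
    int   marked;
    obj  *child;                       /* single outgoing reference */
    void (*simplify)(obj *self);       /* optional simplifier hook */
};

/* Simplifier for indirection nodes: identify further indirections by their
 * simplifier hook and skip over the whole chain in one step. */
static void collapse_indirections(obj *self) {
    while (self->child && self->child->simplify == collapse_indirections)
        self->child = self->child->child;
}

/* Toy mark phase: run the simplifier (if any), then trace children. */
static void mark(obj *o) {
    if (!o || o->marked) return;
    o->marked = 1;
    if (o->simplify) o->simplify(o);   /* object is live: let it simplify */
    mark(o->child);
}

int main(void) {
    obj value = { "value",         0, NULL,   NULL };
    obj ind2  = { "indirection-2", 0, &value, collapse_indirections };
    obj ind1  = { "indirection-1", 0, &ind2,  collapse_indirections };
    obj root  = { "root",          0, &ind1,  NULL };

    mark(&root);

    /* ind1 now points straight at value; ind2 was never traced or marked. */
    printf("indirection-1 now points at: %s\n", ind1.child->name);
    printf("value marked: %d, indirection-2 marked: %d\n",
           value.marked, ind2.marked);
    return 0;
}
```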
Seven at one stroke: results from a cache-oblivious paradigm for scalable matrix algorithms
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178604
Michael D. Adams, David S. Wise
{"title":"Seven at one stroke: results from a cache-oblivious paradigm for scalable matrix algorithms","authors":"Michael D. Adams, David S. Wise","doi":"10.1145/1178597.1178604","DOIUrl":"https://doi.org/10.1145/1178597.1178604","url":null,"abstract":"A blossoming paradigm for block-recursive matrix algorithms is presented that, at once, attains excellent performance measured by• time• TLB misses• L1 misses• L2 misses• paging to disk• scaling on distributed processors, and• portability to multiple platforms.It provides a philosophy and tools that allow the programmer to deal with the memory hierarchy invisibly, from L1 and L2 to TLB, paging, and interprocessor communication. Used together, they provide a cache-oblivious style of programming.Plots are presented to support these claims on an implementation of Cholesky factorization crafted directly from the paradigm in C with a few intrinsic calls. The results in this paper focus on low-level performance, including the new Morton-hybrid representation to take advantage of hardware and compiler optimizations. In particular, this code beats Intel's Matrix Kernel Library and matches AMD's Core Math Library, losing a bit on L1 misses while winning decisively on TLB-misses.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122772306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
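The Morton-hybrid layout mentioned above builds on Morton (Z-order) storage, in which the bits of the row and column indices are interleaved so that every aligned sub-block of the matrix is contiguous in memory. The sketch below shows only this plain Morton indexing for a power-of-two matrix dimension; the paper's hybrid leaf blocks and Cholesky code are not reproduced, and the dimension used is an arbitrary example.

```c
/* Illustrative sketch: Morton (Z-order) indexing for a square matrix whose
 * dimension is a power of two. Interleaving the bits of the row and column
 * indices yields an element order in which every aligned quadrant is stored
 * contiguously, which is what lets block-recursive algorithms be
 * cache-oblivious: each recursive sub-block is one contiguous memory range. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* Spread the low 16 bits of x so they occupy the even bit positions. */
static uint32_t spread_bits(uint32_t x) {
    x &= 0x0000FFFFu;
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    return x;
}

/* Morton index: column bits in even positions, row bits in odd positions. */
static uint32_t morton_index(uint32_t row, uint32_t col) {
    return spread_bits(col) | (spread_bits(row) << 1);
}

int main(void) {
    const uint32_t n = 8;                       /* example dimension */
    double *a = malloc((size_t)n * n * sizeof *a);

    /* Store a[i][j] = 10*i + j using Morton order instead of row-major. */
    for (uint32_t i = 0; i < n; ++i)
        for (uint32_t j = 0; j < n; ++j)
            a[morton_index(i, j)] = 10.0 * i + j;

    /* The top-left 4x4 quadrant occupies offsets 0..15 contiguously. */
    printf("a[3][3] lives at offset %u, value %.0f\n",
           (unsigned)morton_index(3, 3), a[morton_index(3, 3)]);
    free(a);
    return 0;
}
```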
A flexible data to L2 cache mapping approach for future multicore processors
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178613
Lei Jin, Hyunjin Lee, Sangyeun Cho
{"title":"A flexible data to L2 cache mapping approach for future multicore processors","authors":"Lei Jin, Hyunjin Lee, Sangyeun Cho","doi":"10.1145/1178597.1178613","DOIUrl":"https://doi.org/10.1145/1178597.1178613","url":null,"abstract":"This paper proposes and studies a distributed L2 cache management approach through page-level data to cache slice mapping in a future processor chip comprising many cores. L2 cache management is a crucial multicore processor design aspect to overcome non-uniform cache access latency for high program performance and to reduce on-chip network traffic and related power consumption. Unlike previously studied \"pure\" hardware-based private and shared cache designs, the proposed OS-microarchitecture approach allows mimicking a wide spectrum of L2 caching policies without complex hardware support. Moreover, processors and cache slices can be isolated from each other without hardware modifications, resulting in improved chip reliability characteristics. We discuss the key design issues and implementation strategies of the proposed approach, and present an experimental result showing the promise of it.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129027127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
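The approach described above lets the operating system choose, page by page, which L2 cache slice holds a page's lines, so that one mechanism can mimic private, shared, or intermediate caching policies. The C sketch below is a hypothetical software model of that policy choice only; the slice count, page size, and the two policies shown are illustrative assumptions, not the paper's hardware or OS design.

```c
/* Illustrative sketch: page-level mapping of data to L2 cache slices.
 * The OS records a slice number per physical page; hardware would then
 * route every line of that page to the chosen slice. Choosing the local
 * slice mimics private caches; hashing the page number over all slices
 * mimics a shared cache; intermediate policies are also expressible. */
#include <stdio.h>
#include <stdint.h>

#define NUM_SLICES 16            /* hypothetical: one slice per core */
#define PAGE_SHIFT 12            /* 4 KiB pages */

typedef enum { POLICY_PRIVATE, POLICY_SHARED } policy_t;

/* OS policy: pick the slice that will cache the given physical page. */
static unsigned slice_for_page(uint64_t paddr, unsigned requesting_core,
                               policy_t policy) {
    uint64_t page = paddr >> PAGE_SHIFT;
    switch (policy) {
    case POLICY_PRIVATE:
        /* Keep the page in the requesting core's local slice: low latency,
         * but capacity is limited to one slice and sharing causes copies. */
        return requesting_core;
    case POLICY_SHARED:
    default:
        /* Spread pages across all slices: full aggregate capacity,
         * at the cost of on-chip network hops to remote slices. */
        return (unsigned)(page % NUM_SLICES);
    }
}

int main(void) {
    uint64_t addr = 0x0000000012345678ull;    /* example physical address */
    unsigned core = 3;

    printf("private-style mapping -> slice %u\n",
           slice_for_page(addr, core, POLICY_PRIVATE));
    printf("shared-style  mapping -> slice %u\n",
           slice_for_page(addr, core, POLICY_SHARED));
    return 0;
}
```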