ACM/IEEE SC 1997 Conference (SC'97): Latest Papers

Page Replacement Using Marginal Loss Functions
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509643
M. Ujaldón, Shamik D. Sharma, J. Saltz
Abstract: We describe a compiler-directed technique to reduce page faults in multiprocessing systems. Compile-time analysis of access patterns is coupled with runtime support to characterize access patterns in the form of marginal-loss functions; these functions describe the extra page faults that would be incurred for an access pattern if it were given one fewer physical page. The kernel uses these functions to guide its page-replacement decisions by victimizing those processes whose access patterns are affected the least. We outline how marginal-loss functions can be computed for common access patterns and present simulation results to demonstrate the technique's effectiveness.
Citations: 1
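The victim-selection idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Process`, `cyclic_lru_loss`, and `pick_victim` are invented here, and the toy loss model assumes LRU replacement and a program that sweeps repeatedly over a fixed set of pages.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Process:
    name: str
    pages: int  # physical pages currently allocated
    # Hypothetical interface: extra faults if the allocation shrinks by one page.
    marginal_loss: Callable[[int], float]


def cyclic_lru_loss(working_set: int, sweeps: int) -> Callable[[int], float]:
    """Toy marginal-loss function for repeated sweeps over `working_set` pages under LRU.

    With exactly `working_set` pages only the first sweep faults; with one page
    fewer, LRU evicts each page just before it is reused, so every access faults.
    """
    def loss(pages: int) -> float:
        if pages == working_set:
            return float((sweeps - 1) * working_set)
        return 0.0  # surplus pages, or already thrashing: losing one page changes nothing
    return loss


def pick_victim(processes: List[Process]) -> Process:
    """Choose the process whose access pattern is hurt least by losing one page."""
    candidates = [p for p in processes if p.pages > 0]
    if not candidates:
        raise ValueError("no process holds a page to reclaim")
    return min(candidates, key=lambda p: p.marginal_loss(p.pages))


procs = [
    Process("loop", pages=8, marginal_loss=cyclic_lru_loss(working_set=8, sweeps=100)),
    Process("stream", pages=8, marginal_loss=lambda pages: 0.0),  # touches each page once
]
victim = pick_victim(procs)
print(victim.name)  # prints: stream
```

In this toy setup the looping process would pay 792 extra faults for losing a page while the streaming process pays none, so the streaming process is chosen as the victim.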
FM-QoS: Real-time Communication using Self-synchronizing Schedules
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509595
Kay Connelly, A. Chien
Abstract: FM-QoS employs a novel communication architecture based on network feedback to provide predictable communication performance (e.g., deterministic latencies and guaranteed bandwidths) for high-speed cluster interconnects. Network feedback is combined with self-synchronizing communication schedules to achieve synchrony in the network interfaces (NIs). Based on this synchrony, the network can be scheduled to provide predictable performance without special network QoS hardware. We describe the key element of the FM-QoS approach, feedback-based synchronization (FBS), which exploits network feedback to synchronize senders. We use Petri nets to characterize the set of self-synchronizing communication schedules for which FBS is effective and to describe the resulting synchronization overhead as a function of the clock drift across the network nodes. Analytic modeling suggests that for clocks of quality 300 ppm (such as found in the Myrinet NI), a synchronization overhead of less than 1% of the total communication traffic is achievable, significantly better than previous software-based schemes and comparable to hardware-intensive approaches such as virtual circuits (e.g., ATM). We have built a prototype of FBS for Myricom's Myrinet network (a 1.28 Gbps cluster network) which demonstrates the viability of the approach by sharing network resources with predictable performance. The prototype, which implements the local node schedule in software, achieves predictable latencies of 23 µs for a single-switch, 8-node network and 2 KB packets. In comparison, the best-effort scheme achieves 104 µs for the same network without FBS. While this ratio of over four to one already demonstrates the viability of the approach, it includes nearly 10 µs of overhead due to the software implementation. For hardware implementations of local node scheduling, and for networks with cascaded switches, these ratios should be much larger.
Citations: 25
Implementing a Performance Forecasting System for Metacomputing: The Network Weather Service
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509600
R. Wolski, N. Spring, C. Peterson
Abstract: In this paper we describe the design and implementation of a system called the Network Weather Service (NWS) that takes periodic measurements of deliverable resource performance from distributed networked resources, and uses numerical models to dynamically generate forecasts of future performance levels. These performance forecasts, along with measures of performance fluctuation (e.g., the mean square prediction error) and forecast lifetime that the NWS generates, are made available to schedulers and other resource management mechanisms at runtime so that they may determine the quality of service that will be available from each resource. We describe the architecture of the NWS and implementations that we have developed and are currently deploying for the Legion [13] and Globus/Nexus [7] metacomputing infrastructures. We also detail NWS forecasts of resource performance using both the Legion and Globus/Nexus implementations. Our results show that simple forecasting techniques substantially outperform measurements of current conditions (commonly used to gauge resource availability and load) in terms of prediction accuracy. In addition, the techniques we have employed are almost as accurate as substantially more complex modeling methods. We compare our techniques to a sophisticated time-series analysis system in terms of forecasting accuracy and computational complexity.
Citations: 143
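To make the comparison in the abstract above concrete, here is a small sketch of how a "current measurement" predictor and a simple forecaster could be scored by mean square prediction error. It is an assumption-laden illustration only: the forecasters `last_value` and `window_mean` are generic textbook choices rather than the NWS models, and the oscillating series is synthetic data invented for this example, not a result from the paper.

```python
from typing import Callable, Sequence

Forecaster = Callable[[Sequence[float]], float]


def last_value(history: Sequence[float]) -> float:
    """Use the most recent measurement as the prediction."""
    return history[-1]


def window_mean(k: int) -> Forecaster:
    """Predict the mean of the most recent k measurements."""
    def forecast(history: Sequence[float]) -> float:
        recent = history[-k:]
        return sum(recent) / len(recent)
    return forecast


def mean_square_error(series: Sequence[float], forecaster: Forecaster, warmup: int) -> float:
    """Average squared one-step-ahead error, skipping the first `warmup` points."""
    errors = [(series[t] - forecaster(series[:t])) ** 2 for t in range(warmup, len(series))]
    return sum(errors) / len(errors)


# Synthetic load trace that alternates between two levels (made up for illustration).
series = [10.0, 20.0] * 5
print(mean_square_error(series, last_value, warmup=2))      # prints: 100.0
print(mean_square_error(series, window_mean(2), warmup=2))  # prints: 25.0
```

On this deliberately noisy trace the windowed mean has a lower error than the last measurement; on a smooth or trending trace the ranking could differ, which is why a scheduler would want the error measure alongside each forecast.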
Issues in the Design of a Flexible Distributed Architecture for Supporting Persistence and Interoperability in Collaborative Virtual Environments
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509614
J. Leigh
Abstract: CAVERN, the CAVE Research Network, is an alliance of industrial and research institutions equipped with CAVE-based virtual reality hardware and high-performance computing resources, interconnected by high-speed networks, to support collaboration in design, education, engineering, and scientific visualization. CAVERNsoft is the collaborative software backbone for CAVERN. CAVERNsoft uses distributed data stores and multiple networking interfaces to provide the persistence, customizable latency, data consistency, and scalability that are typically needed to support collaborative virtual reality.
Citations: 49
Compiling Parallel Code for Sparse Matrix Applications
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509603
V. Kotlyar, K. Pingali, Paul V. Stodghill
Abstract: We have developed a framework based on relational algebra for compiling efficient sparse matrix code from dense DO-ANY loops and a specification of the representation of the sparse matrix. In this paper, we show how this framework can be used to generate parallel code, and present experimental data demonstrating that the code generated by our Bernoulli compiler achieves performance competitive with that of hand-written codes for important computational kernels.
Citations: 27
The Effects of Communication Parameters on End Performance of Shared Virtual Memory Clusters
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509594
A. Bilas, J. Singh
Abstract: Recently there has been a lot of effort in providing cost-effective shared memory systems by employing software-only solutions on clusters of high-end workstations coupled with high-bandwidth, low-latency commodity networks. Much of the work so far has focused on improving protocols, and there has been some work on restructuring applications to perform better on shared virtual memory (SVM) systems. The result of this progress has been the promise of good performance on a range of applications, at least in the 16-32 processor range. New system area networks and network interfaces provide significantly lower overhead, lower latency, and higher bandwidth communication in clusters; inexpensive SMPs have become common as the nodes of these clusters; and SVM protocols are now quite mature. With this progress, it is now useful to examine which system bottlenecks stand in the way of effective parallel performance; in particular, which parameters of the communication architecture are most important to improve further relative to processor speed, which ones are already adequate on modern systems for most applications, and how this will change with technology in the future. Such information can assist system designers in determining where to focus their energies in improving performance, and users in determining what system characteristics are appropriate for their applications. We find that the most important system cost to improve is the overhead of generating and delivering interrupts. Improving network interface (and I/O bus) bandwidth relative to processor speed helps some bandwidth-bound applications, but currently available ratios of bandwidth to processor speed are already adequate for many others. Surprisingly, neither the processor overhead for handling messages nor the occupancy of the communication interface in preparing and pushing packets through the network appears to require much improvement.
Citations: 35
The Starfire SMP Interconnect
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509630
Alan E. Charlesworth, Nicholas E. Aneshansley, Mark Haakmeester, Dan Drogichen, Gary Gilbert, Ricki Williams, Andy Phelps
Abstract: The Starfire Ultra Enterprise 10000 extends the envelope of UNIX SMP systems in several dimensions. Interconnect: It uses four address routers and a 16x16 data crossbar to provide 64 UltraSPARC processors with uniform memory access at a bandwidth of 10,667 MBps. Flexibility: Starfire can be dynamically reconfigured into multiple hardware-protected operating system domains. Robustness: A specially connected workstation orchestrates service activities. Failing boards can be hot-swapped without interrupting system operation. ECC is carried on both address and data paths. Performance: Starfire has set several TPC-D decision-support benchmark records. It delivers a respectable 21 Gflops on Linpack-parallel.
Citations: 17
A Checkpointing Strategy for Scalable Recovery on Distributed Parallel Systems
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-11-15 DOI: 10.1145/509593.509625
V. Naik, S. Midkiff, J. Moreira
Abstract: We describe a checkpoint/recovery scheme suitable for message-passing parallel applications. The novelty of our scheme is that checkpointed applications can be restored, from their checkpointed state, in reconfigured forms. Using this scheme, applications can quickly recover from partial system failures. A key component of our implementation is the distribution-independent representation of application array data structures in persistent storage. To further optimize performance, we provide parallel array section streaming operations for distributed arrays. We compare the performance of the reconfigurable checkpoint/restart of parallel applications with that of conventional forms of checkpointing.
Citations: 16
Pentium Pro Inside: I. A Treecode at 430 Gigaflops on ASCI Red, II. Price/Performance of $50/Mflop on Loki and Hyglac
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1997-06-01 DOI: 10.1109/SC.1997.10057
Michael S. Warren, J. Salmon, D. Becker, M. Goda, T. Sterling, W. Winckelmans
Abstract: We present results from two methods of solving the gravitational N-body problem on ASCI Red. The first method, a trivial O(N^2) algorithm, obtained 635 Gflops for a 1 million particle problem on 6800 Pentium Pro processors. The second method, a treecode which scales as O(N log N), sustained 170 Gflops over a continuous 9.4 hour period on 4096 processors and 430 Gflops on 6800 processors during the initial part of the simulation. We also present two simulations which sustained roughly one Gigaflop on each of two 16-processor Beowulf-class computers constructed entirely from commodity personal computer technology for $50k each in September 1996.
Citations: 49
Divide and Conquer Spot Noise
ACM/IEEE SC 1997 Conference (SC'97) Pub Date : 1900-01-01 DOI: 10.1145/509593.509612
W. D. de Leeuw
Abstract: The design and implementation of an interactive spot noise algorithm is presented. Spot noise is a technique that uses texture for the visualization of flow fields. Various design tradeoffs are discussed that allow an optimal implementation on a range of high-end graphical workstations. Two applications are given: the steering of a smog prediction simulation and browsing a very large data set resulting from a direct numerical simulation of turbulence. These applications motivate the need for interactive visualization techniques.
Citations: 0