International Conference on Parallel Processing, 2004. ICPP 2004: Latest Papers

Evaluating the scalability of Java event-driven Web servers
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.34
Vicenç Beltran, David Carrera, J. Torres, E. Ayguadé
Abstract: The two major strategies used to construct high-performance Web servers are thread pools and event-driven architectures. The Java platform is commonly used in Web environments, but until now it has not provided any standard API for implementing event-driven architectures efficiently. The new 1.4 release of the J2SE introduces the NIO (New I/O) API to help in the development of event-driven, I/O-intensive applications. We evaluate the scalability that this API provides to the Java platform in the field of Web servers by comparing the most widely used commercial server (Apache) with an experimental server developed using the NIO API. We study the scalability of the NIO-based server and of its rival in a number of different scenarios, including uniprocessor, multiprocessor, bandwidth-bounded and CPU-bounded environments. The study concludes that the NIO API can be successfully used to create event-driven Java servers that scale as well as the best of the commercial native-compiled Web servers, at a fraction of their complexity and using only one or two worker threads.
Citations: 24
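To make the event-driven idea in the abstract above concrete, here is a minimal sketch of a single-threaded readiness loop. It is not the authors' server and it does not use Java NIO: it swaps in Python's standard `selectors` module, which exposes the same select-then-dispatch pattern. The function names `serve_ready` and `demo`, the upper-casing "response", and the use of `socket.socketpair()` in place of real network clients are all assumptions made for illustration.

```python
import selectors
import socket


def serve_ready(sel, timeout=1.0):
    """Serve every connection that is readable right now; return the count."""
    served = 0
    for key, _mask in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            # Stand-in for building a real HTTP response (assumption).
            conn.sendall(data.upper())
            served += 1
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
    return served


def demo(n_clients=3):
    """One thread multiplexes several connections through one selector."""
    sel = selectors.DefaultSelector()
    clients = []
    for _ in range(n_clients):
        # socketpair() stands in for an accepted client connection.
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
        clients.append(client)
    for i, client in enumerate(clients):
        client.sendall(b"get /%d" % i)
    served = 0
    while served < n_clients:
        served += serve_ready(sel)
    replies = [client.recv(4096) for client in clients]
    # Clean up every socket still registered with the selector.
    for key in list(sel.get_map().values()):
        sel.unregister(key.fileobj)
        key.fileobj.close()
    sel.close()
    for client in clients:
        client.close()
    return replies
```

Calling `demo()` serves all three connections from one thread, with no thread per connection. That is the property the abstract contrasts with thread-pool servers.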
Using hardware operations to reduce the synchronization overhead of task pools
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327927
Ralf Hoffmann, Matthias Korch, T. Rauber
Abstract: We consider the task-based execution of parallel irregular applications, which are characterized by an unpredictable computational structure induced by the input data. The dynamic load balancing required to execute such applications efficiently can be provided by task pools. Thus, the performance of a task-based irregular application is tightly coupled to the scalability and the overhead of the task pool used to execute it. To reduce this overhead, this article considers the use of the hardware-specific synchronization operations compare & swap and load & reserve/store conditional. We present several different realizations of task pools using these operations. Runtime experiments on two shared-memory machines, a SunFire 6800 and an IBM p690, show that the new implementations achieve significantly higher performance than implementations relying on the POSIX thread library for synchronization.
Citations: 8
Runtime system for autonomic rescheduling of MPI programs
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327898
C. Du, Sudeshna Ghosh, S. Shankar, Xian-He Sun
Abstract: Intensive research has been conducted on dynamic job scheduling, which dynamically allocates jobs to computing systems. However, most of the existing work is limited to redistributing independent tasks or remains at the algorithm design level. There is no runtime system available to support automatic redistribution of a running process in a heterogeneous network environment. In this study, we present the design and implementation of a system that dynamically reschedules running processes over a network of computing resources via automatic decision-making and process migration. The system is implemented on top of MPI-2 and HPCM (high performance computing mobility) middleware. Experimental and analytical results show that the runtime system works well. It makes dynamic rescheduling of running tasks possible and improves system performance considerably. While the implementation is for MPI programs and uses HPCM, the design of the system is general and can be extended to other distributed environments as well.
Citations: 10
Architectural characterization of an XML-centric commercial server workload
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327935
P. Apparao, R. Iyer, R. Morin, Naren Nayak, M. Bhat, D. Halliwell, W. Steinberg
Abstract: As XML (extensible markup language) rapidly emerges as the standard for information storage and communication, it becomes increasingly important to understand its architectural characteristics and performance implications. In this work, our goal is to characterize a representative XML-based server in a managed runtime environment such as Java. Based on detailed measurements on an Intel® Xeon™ processor-based commercial server running a real-world XML-based server workload, we start by looking at symmetric multiprocessor (SMP) scaling characteristics and the benefits of hyper-threading technology. Using performance monitoring events provided on the processor, we present an overview of the architectural characteristics (such as clocks per instruction (CPI), cache miss rates, memory/bus utilization, branch behavior and efficiency). Using profiling tools like the Intel® VTune™ performance analyzer, we map these architectural/performance characteristics to the various components of application execution, helping us identify hot spots and propose potential enhancements to code generation and application software. We believe that the information presented is useful in understanding XML processing characteristics and may serve as a useful first step toward identifying potential hardware/software optimizations for improved future performance.
Citations: 21
Complexity results and heuristics for pipelined multicast operations on heterogeneous platforms
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327931
Olivier Beaumont, Arnaud Legrand, L. Marchal, Y. Robert
Abstract: We consider the communications involved in the execution of a complex application deployed on a heterogeneous platform. Such applications extensively use macro-communication schemes, such as multicast operations, where messages are broadcast to a set of predefined targets. We assume that there are a large number of messages to be multicast in pipeline fashion, and we seek to maximize the throughput of the steady-state operation. We target heterogeneous platforms, modeled by a graph where links have different communication speeds. We show that the problem of computing the best throughput for a multicast operation is NP-hard, whereas the best throughput to broadcast a message to every node in a graph can be computed in polynomial time. Thus, we introduce several heuristics to deal with this problem and prove that some of them are approximation algorithms. We perform simulations to test these heuristics and show that their results are close to a theoretical upper bound on the throughput that we obtain with a linear programming approach.
Citations: 12
Parallel software for inductance extraction
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327946
H. Mahawar, V. Sarin
Abstract: The next generation of VLSI circuits will be designed with millions of densely packed interconnect segments on a single chip. Inductive effects between these segments begin to dominate signal delay as the clock frequency is increased. Modern parasitic extraction tools that estimate the on-chip inductive effects with high accuracy have had limited impact due to large computational and storage requirements. This work describes a parallel software package for inductance extraction called ParIS, which is capable of analyzing interconnect configurations involving several conductors within reasonable time. The main component of the software is a novel preconditioned iterative method that is used to solve a dense complex linear system of equations. The linear system represents the inductive coupling between filaments that are used to discretize the conductors. A variant of the fast multipole method is used to compute dense matrix-vector products with the coefficient matrix. ParIS uses a two-tier parallel formulation that allows mixed-mode parallelization using both MPI and OpenMP. An MPI process is associated with each conductor. The computation within a conductor is parallelized using OpenMP. The parallel efficiency and scalability of the software are demonstrated through experiments on the IBM p690 and on Intel and AMD Linux clusters. These experiments highlight the portability and efficiency of the software on multiprocessors with shared, distributed, and distributed-shared memory architectures.
Citations: 4
Robust resource allocation for sensor-actuator distributed computing systems
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327919
Shoukat Ali, A. A. Maciejewski, H. Siegel, Jong-Kook Kim
Abstract: This research investigates two distinct issues related to a resource allocation: its robustness and the failure rate of the heuristic used to determine the allocation. The target system consists of a number of sensors feeding a set of heterogeneous applications continuously executing on a set of heterogeneous machines connected together by high-speed heterogeneous links. There are a number of quality of service (QoS) constraints that must be satisfied. A heuristic failure occurs if the heuristic cannot find an allocation that allows the system to meet its QoS constraints. The system is expected to operate in an uncertain environment where the workload, i.e., the load presented by the set of sensors, is likely to change unpredictably, possibly invalidating a resource allocation that was based on the initial workload estimate. The focus of this paper is the design of a static heuristic that: (a) determines a robust resource allocation, i.e., a resource allocation that maximizes the allowable increase in workload until a run-time reallocation of resources is required to avoid a QoS violation, and (b) has a very low failure rate. This study proposes a heuristic that performs well with respect to failure rate and robustness to unpredictable workload increases. This heuristic is, therefore, very desirable for systems where low failure rates can be a critical requirement and where unpredictable circumstances can lead to unknown increases in the system workload.
Citations: 14
Faucets: efficient resource allocation on the computational grid
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327948
L. Kalé, Sameer Kumar, M. Potnuru, J. Desouza, S. Bandhakavi
Abstract: The idea of a "computational grid" suggests that high-end computational power can be thought of as a utility, similar to electricity or water. Making this metaphor work requires a sophisticated "power distribution" infrastructure. We present the Faucets framework, which aims at providing (a) user-friendly compute power distribution across the grid, (b) market-driven selection of compute servers for each job, resulting in effective utilization of resources across the grid, and (c) improved utilization within individual compute servers. Utilization of individual compute servers is improved by the notions of adaptive jobs and smarter job schedulers. Server selection is facilitated by quality-of-service (QoS) contracts for parallel jobs. Market efficiencies are then attained by a bidding and evaluation system that makes the compute servers compete for every job by submitting bids, thus transforming the computational grid into a free market. Job submission and monitoring are simplified by several tools and databases within the Faucets system. We describe the overall architecture of the system. All the essential components of the system have been implemented and are described in this work. We also discuss ongoing work and future research issues.
Citations: 52
Adaptive data partition for sorting using probability distribution
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327928
Xipeng Shen, C. Ding
Abstract: Many computing problems benefit from dynamic partition of data into smaller chunks with better parallelism and locality. However, it is difficult to partition all types of inputs with the same high efficiency. This paper presents a new partition method for sorting based on probability distribution, an idea first studied by Janus and Lamagna in the early 1980s on a mainframe computer. The new technique makes three improvements. The first is a rigorous sampling technique that ensures an accurate estimate of the probability distribution. The second is an efficient implementation on modern, cache-based machines. The last is the use of probability distribution in parallel sorting. Experiments show 10-30% improvement in partition balance and 20-70% reduction in partition overhead, compared to two commonly used techniques. The new method reduces the parallel sorting time by 33-50% and outperforms the previous fastest sequential sorting technique by up to 30%.
Citations: 8
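The general idea behind the abstract above, estimating the input's distribution from a sample and cutting it into roughly equal-probability buckets, can be sketched as follows. This is a simplified illustration and not the authors' algorithm: it uses plain sample quantiles as splitters, and the function names, the default sample size of 256, and the fixed random seed are assumptions.

```python
import bisect
import random


def partition_by_sample(data, buckets, sample_size=256, seed=0):
    """Split data into ordered buckets using quantiles of a random sample."""
    parts = [[] for _ in range(buckets)]
    if not data:
        return parts
    rng = random.Random(seed)
    sample = sorted(rng.sample(data, min(sample_size, len(data))))
    # Equal quantiles of the sample approximate equal-probability cut points
    # of the (unknown) input distribution.
    splitters = [sample[(i * len(sample)) // buckets] for i in range(1, buckets)]
    for x in data:
        # bisect_right sends x to the first bucket whose upper splitter exceeds x.
        parts[bisect.bisect_right(splitters, x)].append(x)
    return parts


def partition_sort(data, buckets=4):
    """Sort by partitioning first, then sorting each bucket independently."""
    out = []
    for part in partition_by_sample(data, buckets):
        # Buckets cover disjoint, increasing value ranges, so each could be
        # sorted by a separate worker; here they are sorted one after another.
        out.extend(sorted(part))
    return out
```

Because every element of one bucket is no larger than every element of the next, concatenating the sorted buckets yields the fully sorted sequence. How evenly the buckets fill depends on how well the sample reflects the input.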
A future of parallel computer architectures
International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15 DOI: 10.1109/ICPP.2004.1327896
M. Hill
Abstract: The document was not made available for publication as part of the conference proceedings.
Citations: 0