{"title":"Degree hunter: on the impact of balancing node degrees in de Bruijn-based overlay networks","authors":"P. Fraigniaud, Hoang-Anh Phan","doi":"10.1109/IPDPSW.2010.5470929","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470929","url":null,"abstract":"This paper presents several mechanisms for balancing the node degrees in de Bruijn-based overlay networks for peer-to-peer systems. One of these mechanisms is shown to perform almost as well as an ideal centralized mechanism, but it is based on the size of the key-spaces assigned to the nodes, and thus it may interfere with protocols aiming at balancing the load of the nodes. We therefore present two other mechanisms that are solely based on the structure of the connections between the nodes. The performances of these two mechanisms depend on the environment. One of them achieves the best performances in file-sharing systems, while the other achieves the best performances in media streaming systems.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123881670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Characterizing heterogeneous computing environments using singular value decomposition","authors":"Abdulla Al-Qawasmeh, A. A. Maciejewski, H. Siegel","doi":"10.1109/IPDPSW.2010.5470875","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470875","url":null,"abstract":"We consider a heterogeneous computing environment that consists of a collection of machines and task types. The machines vary in capabilities and different task types are better suited to specific machine architectures. We describe some of the difficulties with the current measures that are used to characterize heterogeneous computing environments and propose two new measures. These measures relate to the aggregate machine performance (relative to the given task types) and the degree of affinity that specific task types have to different machines. The latter measure of task-machine affinity is quantified using singular value decomposition. One motivation for using these new measures is to be able to represent a wider range of heterogeneous environments than is possible with previous techniques. An important application of studying the heterogeneity of heterogeneous systems is predicting the performance of different computing hardware for a given task type mix.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123617985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive server allocation for peer-assisted Video-on-Demand","authors":"Konstantin Pussep, Osama Abboud, Florian Gerlach, R. Steinmetz, T. Strufe","doi":"10.1109/IPDPSW.2010.5470927","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470927","url":null,"abstract":"Dedicated servers are an undesirable but inevitable resource in peer-assisted streaming systems. Their provision is necessary to guarantee a satisfying quality of experience to consumers, yet they cause significant, and largely avoidable cost for the provider, which can be minimized. We propose two adaptive server allocation schemes that estimate the capacity situation and service demand of the system to adaptively optimize allocated resources. Extensive simulations support the efficiency of our approach, which, without considering any prior knowledge, allows achieving a competitive performance compared to systems that are well dimensioned using global knowledge.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114535496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing adaptive middleware for quantum chemistry applications with a database framework","authors":"Lakshminarasimhan Seshagiri, Meng-Shiou Wu, M. Sosonkina, Zhao Zhang, M. Gordon, Michael W. Schmidt","doi":"10.1109/IPDPSW.2010.5470760","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470760","url":null,"abstract":"Quantum chemistry applications such as the General Atomic and Molecular Electronic Structure System (GAMESS) that can execute on a complex peta-scale parallel computing environment have a large number of input parameters that affect the overall performance. The application characteristics vary according to the input parameters. This is due to the difference in the usage of resources like network bandwidth, I/O and main memory, according to the input parameters. Effective execution of applications in a parallel computing environment that share such resources requires some sort of adaptive mechanism to enable efficient usage of these resources. In our previous work, we have integrated GAMESS with an adaptive middleware NICAN (Network Information Conveyer and Application Notification) for dynamic adaptations during heavy load conditions that modify execution of GAMESS computations on a per-iteration basis. This leads to better application performance. In this research, we have expanded the structure of NICAN in order to include other input parameters based on which application performance can be controlled. The application performance has been analyzed on different architectures and a tuning strategy has been identified. A generic database framework has been incorporated in the existing NICAN mechanism so as to aid this tuning strategy.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116205180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PATIS: Using partial configuration to improve static FPGA design productivity","authors":"T. Frangieh, A. Chandrasekharan, S. Rajagopalan, Yousef Iskander, S. Craven, C. Patterson","doi":"10.1109/IPDPSW.2010.5470755","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470755","url":null,"abstract":"Reconfigurable hardware development and debugging tools aspire to provide software-like productivity. A major impediment, however, is the lack of a module linkage capability permitting hardware blocks to be compiled concurrently, limiting the effective use of multi-core and multiprocessor platforms. Although modular and incremental design flows can reuse the layouts of unmodified blocks, non-local changes to the logical hierarchy or physical layout, or addition of debug circuitry, generally force complete re-implementation. We describe the PATIS dynamic floorplanner, targeting development environments in which some circuit speed and area optimization may be sacrificed for improved implementation and debug turnaround. The floorplan consists of partial modules with structured physical interfaces observable through configuration readback rather than synthesized logic analysis circuitry, allowing module ports to be passively probed without disturbing the layout. Although PATIS supports incremental design, complete re-implementation is still rapid because the partial bitstream for each block is generated by independent and concurrent invocations of the standard Xilinx tools running on separate cores or hosts. A continuous background task proactively generates floorplan variants to accelerate global layout changes. The partial reconfiguration design flow is easier to automate in PATIS because run-time module swapping is not required, suggesting that partial reconfiguration may serve a useful role in large-scale static design.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114868236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 15th international workshop on high-level parallel programming models and supportive environments","authors":"F. Wolf","doi":"10.1109/IPDPSW.2010.5470882","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470882","url":null,"abstract":"The 15th HIPS workshop, to be held as a full-day meeting at the IPDPS 2010 conference in Atlanta, focuses on high-level programming of multiprocessors, compute clusters, and massively parallel machines. Like previous workshops in the series, which was established in 1996, this event serves as a forum for researchers in the areas of parallel applications, language design, compilers, runtime systems, and programming tools.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124161756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large neighborhood local search optimization on graphics processing units","authors":"Thé Van Luong, N. Melab, E. Talbi","doi":"10.1109/IPDPSW.2010.5470889","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470889","url":null,"abstract":"Local search (LS) algorithms are among the most powerful techniques for solving computationally hard problems in combinatorial optimization. These algorithms could be viewed as \"walks through neighborhoods\" where the walks are performed by iterative procedures that move from one solution to another in the solution space. In these heuristics, designing operators to explore large promising regions of the search space may improve the quality of the obtained solutions at the expense of a highly computationally intensive process. Therefore, the use of graphics processing units (GPUs) provides an efficient complementary way to speed up the search. However, designing applications on GPU is still complex and many issues have to be faced. We provide a methodology to design and implement large neighborhood LS algorithms on GPU. The work has been experimented for binary problems by deploying multiple neighborhood structures. The obtained results are convincing in terms of efficiency, quality and robustness of the provided solutions at run time.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126323075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extendable storage framework for reliable clustered storage systems","authors":"Sumit Narayan, J. Chandy","doi":"10.1109/IPDPSW.2010.5470801","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470801","url":null,"abstract":"The total amount of information stored on disks has increased tremendously in recent years with data storage, sharing and backup becoming more important than ever. The demand for storage has not only changed in size, but also in speed, reliability and security. These requirements create a big challenge for storage system architects who aim for a one system fits all design. Storage policies like backup and security are typically set for an entire file system. However, this granularity is too large and can sacrifice storage efficiency and performance, particularly since different files have different storage requirements. In this work, we provide a framework for an attribute-based extendable storage system which will allow storage policy decisions to be made at file-level granularity and at all levels of the storage stack, including file system, operating system, and device managers. We propose to do this by using a file's extended attributes that will enable different tasks via plugins or functions implemented at various levels within the storage stack and provide a complete data-aware storage functionality from an application point of view. We provide examples of how our framework can be used to improve performance in a reliable clustered storage system.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"C-35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126492654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Autonomic management of distributed systems using online clustering","authors":"Andres Quiroz, M. Parashar, I. Rodero","doi":"10.1109/IPDPSW.2010.5470725","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470725","url":null,"abstract":"Distributed computational infrastructures, as well as the applications and services that they support, are increasingly becoming an integral part of society and affecting every aspect of life. As a result, ensuring their efficient and robust operation is critical. However, the scale and overall complexity of these systems is growing at an alarming rate (current data centers contain tens to hundreds of thousands of computing and storage devices running complex applications), making the management of these systems extremely challenging and rapidly exceeding human capability. Furthermore, these systems require simultaneous management along multiple dimensions, including performance, quality of service, power, and reliability.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125661965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Out-of-core distribution sort in the FG programming environment","authors":"P. Natarajan, T. Cormen, E. Strange","doi":"10.1109/IPDPSW.2010.5470692","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470692","url":null,"abstract":"We describe the implementation of an out-of-core, distribution-based sorting program on a cluster using FG, a multithreaded programming framework. FG mitigates latency from disk-I/O and interprocessor communication by overlapping such high-latency operations with other operations. It does so by constructing and executing a coarse-grained software pipeline on each node of the cluster, where each stage of the pipeline runs in its own thread. The sorting program distributes data among the nodes to create sorted runs, and then it merges sorted runs on each node. When distributing data, the rates at which a node sends and receives data will differ. When merging sorted runs, each node will consume data from each of its sorted runs at varying rates. Under these conditions, a single pipeline running on each node is unwieldy to program and not necessarily efficient. We describe how we have extended FG to support multiple pipelines on each node in two forms. When a node might send and receive data at different rates during interprocessor communication, we use disjoint pipelines on each node: one pipeline to send and one pipeline to receive. When a node consumes and produces data from different streams on the node, we use multiple pipelines that intersect at a particular stage. Experimental results show that by using multiple pipelines, an out-of-core, distribution-based sorting program outperforms an out-of-core sorting program based on columnsort, taking approximately 75%–85% of the time, despite the advantages that the columnsort-based program holds.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125999246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}