{"title":"Implementing on-line techniques to allocate file resources in large distributed systems","authors":"E. Pagani, G. P. Rossi","doi":"10.1109/EMPDP.2001.905065","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905065","url":null,"abstract":"The allocation of shared resources in a distributed system is a key aspect to achieve both low latency in accessing the resource and low bandwidth consumption. When the set of users accessing a resource dynamically changes, the allocation policy should adapt the resource placement over time. In the literature, on-line algorithms have been proposed that dynamically relocate a file in the centre of gravity of the set of users that are more frequently accessing it. However, so far those algorithms have not been implemented. They must be adapted to work in real systems, and their interactions with the existing network protocols and applications must be investigated. In this work we study the behaviours of some of those algorithms in real environments. To this purpose, we devised an implementation scheme, and we measured the algorithms' performance in real settings.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114646393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hierarchical computation model for distributed shared-memory machines","authors":"T. Rauber, G. Rünger","doi":"10.1109/EMPDP.2001.905011","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905011","url":null,"abstract":"We present a computation model to describe a clustered memory hierarchy of distributed shared memory machines. The computation model includes the access to shared data stored in different levels of the hierarchy as well as the transfer of entire blocks of data between different levels of the memory. Pure shared memory machines and pure message passing machines can be expressed within the model. As an example, we use the model to analyze a hierarchical matrix multiplication algorithm.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"2000 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120849075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Communication policies performance: a case study","authors":"D. Tessera, A. Dubey","doi":"10.1109/EMPDP.2001.905080","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905080","url":null,"abstract":"Communication activities are one of the most influential factors for the performance of parallel applications, and usually limit the number of processors that can be profitably allocated. Two components usually determine the communication cost of a parallel algorithm. One is the volume and range of data transfer, which is inherent to a specific algorithm. The other is the choice of communication strategy, e.g., point-to-point versus collective exchanges, blocking versus non-blocking protocols, which has an impact on setup costs and on overheads due to buffering and/or contention. Knowledge of the comparative performance of different strategies can be very useful for a user if several choices are available. In this article we present the results of a study to determine the best approach to high volume, long range communications within the framework of a multidimensional FFT algorithm. We have investigated five widely used communication strategies, available in the MPI standard, which have identical data volumes and range of communications. We also present a systematic analysis of the causes of performance differences, with analytical models supporting the experimental evidence.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125453241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-level parallelism in the block-Jacobi SVD algorithm","authors":"G. Okša, M. Vajtersic","doi":"10.1109/EMPDP.2001.905057","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905057","url":null,"abstract":"We analyse the fine-grained parallelism of the two-sided block-Jacobi algorithm for the singular value decomposition (SVD) of a matrix $A \\in \\mathbb{R}^{m \\times n}$, $m \\ge n$. The algorithm involves the class CO of parallel orderings on the two-dimensional toroidal mesh with $p$ processors. The mathematical background is based on the QR decomposition (QRD) of local data matrices and on the triangular Kogbetliantz algorithm (TKA) for local SVDs in the diagonal mesh processors. Subsequent updates of local matrices in the diagonal as well as nondiagonal mesh processors are required. We show that all updates can be realized by orthogonal modified Givens rotations. These rotations can be efficiently pipelined in parallel in the horizontal and vertical rings of $\\sqrt{p}$ processors through the toroidal mesh. For one mesh processor our solution requires $O[(m+n)^2/p]$ systolic processing elements (PEs), $O(m^2/p)$ local memory registers and $O[(m+n)^2/p]$ additional delay elements. The time complexity of our solution is $O[(m+n^{3/2}/p^{3/4})\\Delta]$ time steps per global iteration, where $\\Delta$ is the length of the global synchronization time step that is given by evaluation and application of two modified Givens rotations in TKA.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131110485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What groupware functionality do users really use? Analysis of the usage of the BSCW system","authors":"W. Appelt","doi":"10.1109/EMPDP.2001.905060","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905060","url":null,"abstract":"The BSCW (Basic Support for Cooperative Work) Shared Workspace System is a Web based groupware system developed at GMD that is used by more than 100,000 users world-wide. This paper describes an analysis of the actual usage of the features provided by the system, based on a statistical evaluation of the logfile of the BSCW server http://bscw.gmd.de over a period of more than ten months.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"237 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131873304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the effects of branch prediction accuracy on the performance of SMT architectures","authors":"R. Gonçalves, M. Pilla, G. D. Pizzol, T. Santos, P. Navaux, R. Santos","doi":"10.1109/EMPDP.2001.905062","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905062","url":null,"abstract":"Branch instruction occurrence reduces the parallelism exploited from the source code of single-threaded applications. In order to reduce the branch penalty, several branch predictor techniques have been proposed. Branch predictors allow the fetch unit to continue fetching instructions along a predicted path after a conditional branch has been detected. Such techniques, when used in conventional superscalar architectures, may reach more than 95% accuracy. These same techniques are also used in SMT architectures. However, SMT architectures may behave differently due to the exploitation of parallelism across several threads. Moreover, the effects experienced by one thread may also influence the performance of other threads. In this work, we vary the accuracy of the branch predictor in order to evaluate the impact on the performance of an SMT architecture. Even though the SMT and superscalar architectures behave differently, we observed that the effect of the improvement in the prediction accuracy is similar for both architectures.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115919826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of a VIA based communication protocol for LAM/MPI suite","authors":"M. Bertozzi, M. Panella, M. Reggiani","doi":"10.1109/EMPDP.2001.904967","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.904967","url":null,"abstract":"The increasing use of System Area Networks (SANs) demands efficient communication that benefits from SAN features through direct access to network resources and by avoiding kernel intervention in the communication path. Recently, a consortium composed of Microsoft, Compaq and Intel authored a new standard, the Virtual Interface Architecture (VIA), designed to reduce software overhead in data transfers. This paper describes the communication protocol proposed in order to allow a complete implementation of MPI based on VIA. This protocol is needed because the plain use of the two VIA data transfer models does not allow the implementation of MPI based on VIA, due to the large number of MPI communication flavors. To validate the proposed protocol, a new communication layer based on VIA has been introduced in the LAM/MPI suite. The reported results, referring to a software VIA implementation for fast Ethernet networks, exhibit a significant reduction in the latency of LAM/MPI based on VIA with respect to the same library based on the TCP/IP protocol.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132058735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic re-configurable transaction management in AgentTeam","authors":"Bora I. Kumova","doi":"10.1109/EMPDP.2001.905051","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905051","url":null,"abstract":"With respect to data format and data consistency, the spectrum of DDBM ranges from management of relational data in distributed database management systems (DDBMS) to the management of non-relational data in information retrieval systems. On the other hand, in many networked application domains users need to access any type of data, preferably with a single tool. However, DDBM is still a challenging research area that involves bridging syntactic and semantic heterogeneity of data as well as of functionality, especially in the bottom-up design of a new DDB. Since existing DDBM systems were usually built by focusing on the implementation of some dedicated protocols for DDBM, they are inflexible with respect to major modifications or exchange of the protocols. In addition, their software architecture usually does not comply with well-known design paradigms, which could facilitate the maintenance of the software system. We present an agent-based approach for flexible DDBM, where independent DDBM protocols are modularly exchangeable for test purposes. This capability of the DDBMS provides flexibility against protocol heterogeneity and enables the DDBMS to communicate with new DBMSs to be included, by combining or adapting its protocols at run time. For instance, different DDB consistency levels are possible for different DDBs with the same DDBMS. In this work, the design of DDBM in the form of a multi-agent system of the AgentTeam framework is discussed, particularly the dynamic reconfigurable transaction management model. Furthermore, the CourseMan prototype is described, which is implemented in the form of a test-bed that provides a homogeneous environment for testing different DDBM protocols upon heterogeneous DBMSs.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127586723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating mobile agent infrastructures with CORBA-based distributed multimedia applications","authors":"P. Bellavista, Antonio Corradi, Domenico Cotroneo, S. Russo","doi":"10.1109/EMPDP.2001.905034","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905034","url":null,"abstract":"The increased computing power and the enhanced connectivity of current open computing systems are encouraging the deployment of new classes of services both centered around dynamically changing user requirements and based on the exploitation of the Internet infrastructure. Distributed Multimedia Applications (DMAs) are a typical class of services with challenging requirements in terms of resource demand, dynamicity and QoS adaptation. The paper claims that distributed objects and mobile agents can complement each other to provide a flexible middleware for DMAs, and describes the case study of MADAMA (Mobile Agent-based Distributed Architecture for Multimedia Applications). MADAMA adopts mobile agents to simplify the distribution of service control and to provide location-aware adaptability. In addition, MADAMA is compliant with CORBA to achieve large accessibility and interoperability.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114995033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adding flexibility and real-time performance by adapting a single processor industrial application to a multiprocessor platform","authors":"Leif Enblom, L. Lindh","doi":"10.1109/EMPDP.2001.905079","DOIUrl":"https://doi.org/10.1109/EMPDP.2001.905079","url":null,"abstract":"This paper describes a way to get more flexibility in a real-time product and its base platform (real-time operating system and hardware). Industrial hardware and software platforms are subject to change, and in some cases a new platform is needed after five to ten years, if not earlier. This is costly, and there is a need to be able to make the product grow in performance without changing the platform. The ongoing work described in this paper is performed in cooperation with industry, and the aim is to convert a single processor software application to a multiprocessor application. By changing the platform to a flexible multiprocessor real-time problem, flexibility and performance are increased, resulting in a more optimized platform for different configurations of the application.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114076107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}