C. D. Tsao, H. Hsu, Jyh-Yen Chen, J.-H. Huang, Shoou-Gwo Jiang, B. Lin
{"title":"The design and performance considerations for multimedia applications using FDDI synchronous services","authors":"C. D. Tsao, H. Hsu, Jyh-Yen Chen, J.-H. Huang, Shoou-Gwo Jiang, B. Lin","doi":"10.1109/ICPADS.1994.590096","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590096","url":null,"abstract":"In this paper, an architectural design and implementation of a multimedia conference system over FDDI networks is presented. In this development process, various critical design issues are considered and approaches are proposed to optimize the architectural design to yield acceptable performance. The issues include 1) multicasting capability, 2) design of a high-speed transport protocol, and 3) delay and jitter performance of the FDDI synchronous service.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121987324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Load balancing in pipelined processing of multi-join queries","authors":"Hongjun Lu, K. Tan, Chiang Lee","doi":"10.1109/ICPADS.1994.590427","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590427","url":null,"abstract":"Looks at how to effectively exploit pipelining for multi-join queries in shared-nothing systems. A multi-join query can be processed using an iterative approach. In each iteration, several relations are selected and are joined in a pipelined fashion. However, algorithms that are based on this approach have traditionally assumed that the relations are uniformly distributed or only slightly skewed. When this assumption is relaxed, i.e. when the data is skewed, some nodes may be assigned a larger amount of data than can fit into their memories. As such, pipelining cannot be effectively exploited, and performance may degenerate drastically. We propose four skew handling techniques to deal with data skew for multi-join queries. The results of a performance study show that a hybrid technique is superior in most cases.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122493304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting multi-thread with data localities for vector computers","authors":"J. Sheu, Chih-Yung Chang","doi":"10.1109/ICPADS.1994.590357","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590357","url":null,"abstract":"In this paper, we propose a source-to-source compilation strategy to partition vectorized loop programs into multithreaded execution form. Each partitioned thread consists of statement instances with locality in vector registers. The multithreading scheme gives a novel combination of loop unrolling, statement instance reordering, index shifting, vector register reuse, and multithreading. Experimental results show that our multithreading scheme helps the vector compiler of the Convex C38 series reduce the number of memory accesses and the number of synchronizations among CPUs, and that it usually yields better performance.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123745524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective load balancing on highly parallel multicomputers based on superconcentrators","authors":"G. Jan, Ming-Bo Lin","doi":"10.1109/ICPADS.1994.590133","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590133","url":null,"abstract":"Tree and mesh architectures have been considered two of the most highly scalable parallel multicomputers because their scalability is superior to that of hypercubes. However, load balancing on these two multicomputer systems is not as good as expected. In the worst case, redistributing the workload over the system requires O(M × p × log p) routing time for the tree architecture and O(M × √p) for the mesh architecture when a pipelined packet routing scheme is used. In this paper, we propose an approach based on superconcentrators to reduce the above bounds to O(M log p) for both cases with only additional O(p) cost. Furthermore, by using this scheme, the underlying systems can leave the load balancing problem entirely to the superconcentrator, so that no additional workload arises for the systems. In addition, this scheme adds extra communication paths to the processors, so that it not only increases the communication capacity among the processors but can also tolerate edge faults of the systems.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127272879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stochastic modeling of scaled parallel programs","authors":"A. Malony, V. Mertsiotakis, Andreas Quick","doi":"10.1109/ICPADS.1994.590308","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590308","url":null,"abstract":"Testing the performance scalability of parallel programs can be a time consuming task, involving many performance runs for different computer configurations, processor numbers, and problem sizes. Ideally, scalability issues would be addressed during parallel program design, but tools are not presently available that allow program developers to study the impact of algorithmic choices under different problem and system scenarios. Hence, scalability analysis is often reserved to existing (and available) parallel machines as well as implemented algorithms. In this paper we propose techniques for analyzing scaled parallel programs using stochastic modeling approaches. Although allowing more generality and flexibility in analysis, stochastic modeling of large parallel","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130432710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending Vienna Fortran with task parallelism","authors":"B. Chapman, P. Mehrotra, J. Rosendale, H. Zima","doi":"10.1109/ICPADS.1994.590306","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590306","url":null,"abstract":"Vienna Fortran supports a wide range of data-parallel numerical problems. However, a significant number of scientific and engineering applications are of a multi-disciplinary and heterogeneous nature and thus do not fit well into the data parallel paradigm. In this paper we present new language extensions to fill this gap. Tasks can be spawned as asynchronous activities in a homogeneous or heterogeneous computing environment; they interact by sharing access to Shared Data Abstractions (SDAs). SDAs are an extension of Fortran 90 modules, representing a pool of common data, together with a set of methods for controlled access to these data and a mechanism for providing persistent storage. These extensions support the integration of data and task parallelism and can be used to express task parallel applications in a natural and efficient way.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132935821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward semantic-based parallelism in production systems","authors":"Shiow-yang Wu, Daniel P. Miranker, J. Browne","doi":"10.1109/ICPADS.1994.590466","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590466","url":null,"abstract":"We propose a new approach for the parallel execution of production system programs. This approach embodies methods of decomposition abstraction using declarative mechanisms. Application semantics can then be exploited to achieve a much higher degree of concurrency. We present the underlying object-based framework of production systems and discuss the ensuing semantic-based dependency analysis technique. In particular, we define a new notion of functional dependency to characterize associative relationships among data objects, which can be used to determine concurrently executable rules.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"234 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133362556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast switching double processing architecture for multi-tasking real-time systems","authors":"Tein-Hsiang Lin, Jui-ping Liao","doi":"10.1109/ICPADS.1994.589904","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.589904","url":null,"abstract":"A new fast switching double processing architecture for pipelined cache-based real-time computer systems is proposed to reduce the CPU stalls due to increased cache misses resulting from frequent task switching in multi-tasking real-time applications. In this architecture, two sets of registers are provided so that two tasks can be executed alternately on a cycle-by-cycle basis. This architecture helps alleviate the problem of unpredictable cache performance due to frequent context switches in multi-tasking systems. The performance of double processing is evaluated first through trace-driven simulation for various cache configurations. An analytical performance model is then derived to further explain the performance advantage.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129374627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using the imprecise-computation technique for congestion control on a real-time traffic switching element","authors":"V. Millan-Lopez, W. Feng, J. Liu","doi":"10.1109/ICPADS.1994.590126","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590126","url":null,"abstract":"The broadband integrated services digital network provides communication services with different requirements, including real-time services such as voice and video. Real-time services are affected by the probabilistic behavior of such a network. In particular, when the network becomes congested, the end-to-end packet delay may exceed the maximum allowed. Fortunately, many real-time services are willing to trade service quality for information timeliness. The imprecise-computation technique, in combination with layered coding schemes, makes this tradeoff possible.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"61 5-6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132285974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel block generalized WZ factorization","authors":"A. Benaini, David Laiymani","doi":"10.1109/ICPADS.1994.590084","DOIUrl":"https://doi.org/10.1109/ICPADS.1994.590084","url":null,"abstract":"In this paper we first present a block strategy for the generalized WZ factorization, which consists of block factorizing a matrix A in the form A = WZW^{-1}. This study shows how a block strategy may be used to reduce a large eigenvalue problem into a number of smaller ones. Next, we develop a parallel multi-phase algorithm for this method, which requires operations such as matrix products, Gauss-Jordan elimination, broadcasting, scattering and gathering. To design our multi-phase algorithm we used an informal methodology, similar to sequential top-down analysis, which supports the design of efficient multi-phase parallel algorithms. The experimental tests show good speed-up and corroborate the theoretical evaluations.","PeriodicalId":154429,"journal":{"name":"Proceedings of 1994 International Conference on Parallel and Distributed Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131068082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}