{"title":"A Distributed Genetic Algorithm with Adaptive Diversity Maintenance for Ordered Problems","authors":"Ryoma J. Ohira, Md. Saiful Islam","doi":"10.1109/PDCAT46702.2019.00063","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00063","url":null,"abstract":"Maintaining population diversity is critical to the performance of a Genetic Algorithm (GA). Applying appropriate strategies for measuring population diversity is important in order to ensure that the mechanisms for controlling population diversity are provided with accurate feedback. Sequence-wise approaches to measuring population diversity have demonstrated their effectiveness in assisting with maintaining population diversity for ordered problems; however, these processes increase the computational costs for solving ordered problems. Research in distributed GAs has demonstrated how applying different distribution models can affect a GA's ability to scale and effectively search the solution space. This paper proposes a distributed GA with adaptive parameter controls for solving ordered problems such as the travelling salesman problem (TSP), capacitated vehicle routing problem (CVRP) and the job-shop scheduling problem (JSSP). Extensive experimental results demonstrate the superiority of the proposed approach.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130795695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message from the General Chairs","authors":"D. D’Auria, Masahiro Hayakawa, Sheida Malekpour, Stephan Matzka, M. Moreno, Aleksander Slominski, Atsushi Kitazawa","doi":"10.1109/pdcat46702.2019.00005","DOIUrl":"https://doi.org/10.1109/pdcat46702.2019.00005","url":null,"abstract":"Artificial Intelligence (AI) is concerned with computing technologies that allow machines to see, hear, talk, think, learn, and solve problems even above the level of human beings. On the one hand it allows business decisions to be driven by real-time models that enable unprecedented levels of accuracy and efficiency. On the other hand it enables general and professional problem solving and knowledge discovery that cannot be easily done by humans. In addition, business-business, business-customer, and customer-customer may be interconnected in a revolutionary way to support new business models with elevated customer experiences.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129390473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning Based Performance Analysis and Prediction of Jobs on a HPC Cluster","authors":"Zhengxiong Hou, Shuxin Zhao, Chao Yin, Yunlan Wang, Jianhua Gu, Xingshe Zhou","doi":"10.1109/PDCAT46702.2019.00053","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00053","url":null,"abstract":"There are many medium-scale and small-scale high-performance computing clusters at universities, research institutes, and similar organizations. Large volumes of job logs have been accumulated after many years of operation. In this paper, we examine and analyze the job logs accumulated on a high-performance computing cluster. Then, we study machine learning based performance analysis and prediction methods for parallel jobs. Various machine learning methods, such as multivariate linear fitting and artificial neural networks, are used to build performance prediction models. We compare the errors of each model, and select the optimal prediction model for different users. The experimental results show that we can obtain reasonable prediction accuracy using the selected machine learning algorithms.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121917200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimising Deep Learning Split Deployment for IoT Edge Networks","authors":"Cailen Robertson, Jia Li, Ryoma J. Ohira, Quoc Viet Hung Nguyen, Jun Jo","doi":"10.1109/PDCAT46702.2019.00069","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00069","url":null,"abstract":"The Internet of Things (IoT) often generates large volumes of messy data which are difficult to process efficiently. While deep learning models have demonstrated their suitability in processing this data, their memory and processing requirements make them difficult to deploy on edge nodes while achieving viable throughput results. Current solutions involve deploying the model in the cloud, but this leads to increased network costs due to the transfer of raw data. However, the layer-based design of deep learning models allows for a model to be split into sub-models and deployed separately across IoT nodes. By deploying parts of the model on the edge node and in the cloud, the edge node is able to transmit an intermediate layer's feature output to the following sub-model instead of the raw input data. This reduces the size of the data being transmitted and results in a lower cost to the network. However, selecting the best layer to split the model becomes a multi-objective optimisation problem. In this paper, we propose an optimisation method that considers the network cost, input rate and processing overhead in selecting the best layer for splitting a model across an IoT network. We profile several popular model architectures to highlight their performance using this split deployment. Results from simulated and physical tests of the optimal layers are provided to demonstrate the method's effectiveness in real-world applications.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123940903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Fault-Tolerant Syndrome Measurement of Quantum Error-Correcting Codes Based on \"Flag\"","authors":"QiFei Wei, Dongxiao Quan, Jing Liu, Changxing Pei","doi":"10.1109/PDCAT46702.2019.00041","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00041","url":null,"abstract":"Fault-tolerant syndrome measurement plays an important role in the process of quantum error correction, and considerable effort has been made to reduce the physical overhead of syndrome measurement, which includes the ancilla qubits, the CNOT gates and the time-slots. The two-extra-qubit syndrome measurement technique, known as \"flag\"-style syndrome measurement, reduces the number of extra qubits to a minimum. However, it works slowly because it measures only one syndrome at a time. We extend the technique to extract all syndromes of the distance-3 quantum error-correcting code at once. We propose a new method, which we call the dynamic time-slot allocation scheme, that increases the parallelism of the syndrome measurement circuit and reduces the time overhead by appropriately allocating and adjusting the order of the CNOT gates used to measure the data block. The scheme is applicable to both Hamming codes and color codes. For a CSS quantum error-correcting code with the number of stabilizers m and the maximum weight w, we only need 2m extra qubits and 2 × (w+2) time-slots for one-shot measurement of all syndromes.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114423051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Mobile Data Collection and Energy Supply Scheme for Rechargeable Wireless Sensor Networks","authors":"Zhansheng Chen, Hong Shen","doi":"10.1109/PDCAT46702.2019.00098","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00098","url":null,"abstract":"To alleviate the energy hole problem inherent in the static Base Station (BS) scheme and extend network work of battery-restricted wireless rechargeable sensor networks, a joint mobile data collection and adaptive charging scheduling (MDC-ACS) scheme based on a virtual grid is proposed in this paper, aiming to achieve whole-network energy balance and high charging efficiency. In the MDC-ACS scheme, the network is first divided into several grids and several rendezvous are determined using geographic information. Then, the mobile data collector (MDC) moves in the grids for data collection within the application delay, and its moving trajectory is guided by the rendezvous. Due to the limited battery capacity of the mobile wireless charging vehicle (WCV) and its energy consumption while cruising, an adaptive charging schedule scheme is proposed for maximizing the recharging benefit. With extensive simulation, we demonstrate that the MDC-ACS scheme can achieve better charging benefit and reduce average charging delay.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132414641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reachability in Multithreaded Programs Is Polynomial in the Number of Threads","authors":"A. Malkis","doi":"10.1109/PDCAT46702.2019.00078","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00078","url":null,"abstract":"Reachability in multithreaded programs is an important yet inherently difficult problem, even if they are finite-state and equipped with the interleaving semantics. So far, the complexity of this problem in the number of threads n, while keeping the maximal size of the thread-local memory and the size of shared memory bounded by a constant, has been explored poorly. We close this gap by measuring aspects such as (i) the diameter, i.e., the longest finite distance realizable in the transition graph of the program, (ii) the local diameter, i.e., the maximum distance from any program state to any thread-local state, and (iii) the computational complexity of bug-finding. We prove that all these are majorized by a polynomial in n and, in certain cases, by a linear, logarithmic, or even constant function in n. Such bounds shed more light on how the widely expressed claim, that one of the major obstacles to analyzing concurrent programs is the exponential state explosion in the number of threads, should (and should not) be understood.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"04 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129205544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introducing PulseAT: A Tool for Analyzing System Utilization in Distributed Systems","authors":"Uwe Jahn, V. Poliakov, Meghadoot Gardi, Peter Schulz, Carsten Wolff","doi":"10.1109/PDCAT46702.2019.00057","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00057","url":null,"abstract":"For the development and maintenance of distributed systems, it is useful to analyze the system condition and utilization for each hardware component. With pulseAT, a tool has been developed which collects that system utilization systematically with lightweight pulseAT Agents. The hierarchical structure of pulseAT allows having all system utilization data at one place on a pulseAT Manager to show an overall current health condition of the system. A cloud-based pulseAT Analyzer stores the data into a time-based database to support long-term analyses and to process analyzing algorithms, e.g., to forecast future health conditions. This paper describes the structure of pulseAT, main concepts, e.g., how the response time for each component is calculated. Some technical details of the implementation are shown. Finally, it describes how pulseAT has been tested on a mobile robot, the DAEbot.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124544279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Library for Hadoop Deflate Compression Based on FPGA Accelerator with Load Balance","authors":"Haixin Du, Jiankui Zhang, Shihao Sha, Cai Ye, Qiuming Luo","doi":"10.1109/PDCAT46702.2019.00056","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00056","url":null,"abstract":"Hadoop applications produce many intermediate results in the map/reduce process, which require disk I/O and network transmission. Compressing the large-scale intermediate result data greatly improves disk access efficiency and reduces program run time. Hardware-accelerated solutions have become more desirable. This paper designs a multi-FPGA compression accelerator on the Hadoop platform and analyzes the system performance compared with a software-only solution that mainly uses the CPU for processing. The testing programs are zpipe, TestDFSIO and Terasort. Compared with the software-only solution, the max speedup of zpipe is 6.55X (single FPGA) and 10.24X (dual FPGA), the max speedup of TestDFSIO is 6.28X (single FPGA) and 6.28X (dual FPGA), and the max speedup of the Terasort application is up to 3.25X (single FPGA) and 3.35X (dual FPGA).","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121199772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concurrent Failure Recovery for Product Matrix Regenerating Code","authors":"Jingyao Zhang","doi":"10.1109/PDCAT46702.2019.00060","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00060","url":null,"abstract":"Regenerating codes can minimize the network bandwidth required to recover the lost data in case of node failure in distributed storage systems. Product Matrix (PM) code is an important kind of Minimum Storage Regenerating (MSR) code that can maximize the storage efficiency while minimizing the repair bandwidth. The original PM code only addressed single node failure. In this work, we propose an algorithm for recovering multiple failed nodes concurrently for PM code. The explicit construction of the Repair Matrix, which is applicable to any reasonable combination of coding parameters, is presented, and the lost data can be obtained by simply multiplying the helper data with the repair matrix, which makes the method easy to implement. Based on the proposed strategy, the bandwidth needed for two major repair policies, centralized and distributed recovery, is given formally. Moreover, the impact of the Repairing Degree (the number of surviving nodes from which the assistant data are downloaded) on the bandwidth cost is studied, which can help make optimal decisions in practical storage systems.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128177503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}