{"title":"Effect of Parallel Processing by Duplicating Histogram in Automatic Image Binarization for High-Level Synthesis","authors":"Moena Yamasaki, A. Yamawaki","doi":"10.1109/PDCAT46702.2019.00105","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00105","url":null,"abstract":"The high-level synthesis converting software to hardware automatically is one of the important technologies for significantly reducing the burden caused by developing hardware module. The embedded image processing products need the HLS to quickly implement a hardware module achieving high-performance with low-power consumption instead of software implementation for high-computational processing. However, the HLS tool cannot generate a hardware module with high-performance and low-power consumption expected when the software program without deep consideration about the hardware organization to be generated is input. This paper shows a software description method to extract some parallelisms by duplicating the histograms in the automatic image binarization which is called as Otsu's method in order to improve a performance of the hardware module HLS generates. The experimental results show the effect of our description method about the performance and the amount of hardware. We also discuss about a trade-off among the parallelism, performance and amount of hardware.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131225604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Segmentation of Substantia Nigra and Red Nucleus in Quantitative Susceptibility Mapping Images","authors":"Dibash Basukala, R. Mukundan, T. Melzer, A. Lim","doi":"10.1109/PDCAT46702.2019.00074","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00074","url":null,"abstract":"Substantia nigra (SN) and red nucleus (RN) located in midbrain are integral in the study of brain disease such as Parkinson's disease (PD). The automatic segmentation of SN and RN in high-resolution quantitative susceptibility mapping (QSM) images can aid in PD characterization and progression. However, only a few methods have been proposed to segment them, owing to the recent development of high quality imaging. Therefore, we describe a novel method for the segmentation of SN and RN in QSM images using contrast enhancement, level set method, wavelet transform and watershed transform. The segmentation performance is evaluated in 20 subjects containing both healthy and PD patients. The results of the proposed segmentation method were closer to the manual segmentation performed by the radiologist than the popular level set methods. The Dice coefficient of the left SN and right SN were 0.77 ± 0.09 and 0.78 ± 0.07 respectively while the Dice for the left RN and right RN were 0.80 ± 0.08 and 0.77 ± 0.08 respectively.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134156409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Color Distortion Removal for Heart Rate Monitoring in Fitness Scenario","authors":"Quoc-Viet Tran, S. Su, M. Tran","doi":"10.1109/PDCAT46702.2019.00073","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00073","url":null,"abstract":"Heart rate estimation from fitness plays an important role in the evaluation of fitness exercises. Conventional approaches use the photoplethysmography (PPG) sensor to consider the change of light absorption on the wrist skin for heart rate estimation. However, users are required to buy smartwatches for using this function. Various approaches based on video analysis are recently implemented for surveillance purpose. However, it is unstable for motion scenario such as fitness exercises due to the color distortion induced by movement. POS and CHROM are introduced to address this issue. Since the fixed projection planes from POS and CHROM are given in several sources of light, it is not widely applied for surveillance applications. Therefore, a novel projection plane that is adaptively changed with the lighting environment is proposed to estimate the heart rate from fitness videos in ambient light. Moreover, image and digital signal processing techniques are also applied to extract the clean pulse signal from a novel projection plane. From the experiments conducted, the proposed approach outperformed the existing approaches to be the best model for heart rate estimation from fitness videos with the accuracy up to 91.08%.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131458057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA-Based Parallel Multi-Core GZIP Compressor in HDFS","authors":"Haoxin Luo, Ye Cai, Qiuming Luo, Rui Mao","doi":"10.1109/PDCAT46702.2019.00017","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00017","url":null,"abstract":"With the development of Big Data, data storage has been exposed to more challenges. Data compression which can save both storage and network bandwidth, is a very important technology to deal with the challenges. In this paper, we present an end-to-end, complete, high-throughput parallel multi-core GZIP compressor in FPGA for HDFS. The GZIP compressor is designed by the scalable architecture, which supports to increase throughput by expanding multiple compression cores based on systolic array architecture. We implemented and evaluated the hardware compressor in Alpha Data Adm-Pcie-KU3 FPGA board, utilizing RIFFA for data transfers over PCI Express. According to the evaluation results, up to 16-cores compressor can be implemented and the peak compression throughput exceeds 1.1 GB/s. It is 70X speedup compared with the software compression solution. When we load the hardware compressor into HDFS, the performance of HDFS is twice as much as that without loading the compressor.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126669933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Overlapping Community Detection of Complex Network: A Survey","authors":"Qi Chen, L. Wei","doi":"10.1109/PDCAT46702.2019.00102","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00102","url":null,"abstract":"It is well established that the network is ubiquitous. Social platforms, academic system, and other systems all exist in the form of networks, which often reflect the connections between different individuals in the real world. Effective community detection algorithm can explore the hidden community structure in the network, which has a great positive impact on people's daily life. At present, it has been widely applied in online public opinion monitoring, personalized recommendation, advertising and other fields. As the network structure tends to be complicated, the detection of community structure of complex networks has become a hot topic of current research. This paper reviews the state-of-the-art in the overlapping community detection of complex networks, and briefly summarizes the advantages and applications of each algorithm. Furthermore, the current challenges in overlapping community detection of complex networks are illustrated and some suggestions on future research are proposed.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126797568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Conjugate Gradient using OmpSs","authors":"Sandra Catalán, X. Martorell, Jesús Labarta, Tetsuzo Usui, Leonel Antonio Toledo Díaz, Pedro Valero-Lara","doi":"10.1109/PDCAT46702.2019.00033","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00033","url":null,"abstract":"In this paper, we present the benefits of using the clause concurrent of OmpSs when performing reductions, more specifically, when applied to the dot product (DOT) operations. We analyze its benefits through the implementation of different versions of the Conjugate Gradient (CG) method. We start from a parallel version of the code based on tasks and dependencies; later, we introduce the use of the concurrent clause, which allows to overlap the execution of tasks that have data dependencies among them. In this way, we want to show the benefits of the concurrent clause, which might be included in OpenMP standard as previously done with other OmpSs features. Our tests, performed on a single node of the (Intel-based) Marenostrum 4 Supercomputer and a single socket of the (ARM-based) Dibona cluster, show that the use of the concurrent clause may improve performance with respect to the version where only tasks and dependencies are used around 37% and 23% respectively.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"56 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134560309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NFV Optimization Algorithm for Shortest Path and Service Function Assignment","authors":"A. Kalyan","doi":"10.1109/PDCAT46702.2019.00081","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00081","url":null,"abstract":"Our paper focuses on the concept of Network Function Virtualization (NFV): the implementation of requests consisting of various service functions on servers located in data centers. This paper attempts to minimize both the cost of routing and service function assignment of requests from source to destination node on a network. This problem falls under the class of Integer Linear Programming (ILP) which is NP Hard and cannot be solved in polynomial time. Towards developing a solution, we propose to split the problem into two separate optimization subproblems: shortest path routing and service function assignment. We utilize Dijkstra's Shortest Path algorithm and a Greedy method for service function assignment to propose a new heuristic algorithm that minimizes the total cost of routing and service functions assignment. We also analyze the run-time complexity of our proposed heuristic algorithm. Our experimental results suggest that our proposed algorithm matches the optimal ILP solution within acceptable limits.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114170755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DOCKERANALYZER : Towards Fine Grained Resource Elasticity for Microservices-Based Applications Deployed with Docker","authors":"M. Fourati, Soumaya Marzouk, K. Drira, M. Jmaiel","doi":"10.1109/PDCAT46702.2019.00049","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00049","url":null,"abstract":"This article deals with anomaly detection for microservices-based applications during elastic treatment. In elastic treatment, scaling-up resources is based on threshold. Many studies consider that threshold exceeding is caused by the increase in requests number. However, this exceeding may be caused by many problems such as specific requests requiring a lot of resources or issues related to VMs and containers. That's why, when thresholds are exceeded we propose to apply an analysis treatment that detects and identifies the root cause of the threshold exceeding, either it's caused by a problem such as specific request, VM issue, container issue or it's caused by a normal increase in request's number. This paper presents \"DOCKERANALYZER\" a software module that detects and identifies execution problems in microservices context. Experimental measurements have been conducted on an IOT platform as a real use-case presenting realistic problems and demonstrating the effectiveness of our proposed solution.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126179208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tasking in Accelerators: Performance Evaluation","authors":"Leonel Toledo, Antonio J. Peña, Sandra Catalán, Pedro Valero-Lara","doi":"10.1109/PDCAT46702.2019.00034","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00034","url":null,"abstract":"In this work, we analyze the implications and results of implementing dynamic parallelism, concurrent kernels and CUDA Graphs to solve task-oriented problems. As a benchmark we propose three different methods for solving DGEMM operation on tiled-matrices; which might be the most popular benchmark for performance analysis. For the algorithms that we study, we present significant differences in terms of data dependencies, synchronization and granularity. The main contribution of this work is determining which of the previous approaches work better for having multiple task running concurrently in a single GPU, as well as stating the main limitations and benefits of every technique. Using dynamic parallelism and CUDA Streams we were able to achieve up to 30% speedups and for CUDA Graph API up to 25x acceleration outperforming state of the art results.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130598755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low Computational Data Fusion Approach Using INS and UWB for UAV Navigation Tasks in GPS-Denied Environments","authors":"Xiaoying Kong, Gengfa Fang, Li Liu, Tich Phuoc Tran","doi":"10.1109/PDCAT46702.2019.00080","DOIUrl":"https://doi.org/10.1109/PDCAT46702.2019.00080","url":null,"abstract":"This paper presents a low computational approach for unmanned aerial vehicles (UAV) navigation in GPS-denied environments. This approach is aiming to reduce computation load for UAV flying mission constraints. Small size, light weight on board hardware are constraints for UAV deployment and flying missions. The on board processor should not be built with high complexity and should consume as little computing as possible. Most existing approaches use Kalman filter, extended Kalman filter, Unscented filter, or particle filter to fuse different types of onboard sensor data to estimate UAV position. We developed a data fusion architecture that does not use these filters. We use an ultra-light-coupling fusion architecture. In this architecture, primary sensor and secondary sensor data are fused. When the secondary sensor is unavailable in most of the time, the UAV navigation uses the output of the primary sensor. When the secondary sensor signal is available, the primary sensor is re-aligned using the secondary sensor signal to bond the errors. In our approach, the primary sensor is Inertial Measurement Unit (IMU), and the secondary sensor inputs are from Ultra-wideband system (UWB). This approach is validated using demonstration of comparison of computing load, and simulation results for accuracy and reliability testing using UAV flying mission scenario.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123351565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}