P. D'Ambra, Fabio Durastante, S. Ferdous, S. Filippone, M. Halappanavar, A. Pothen
{"title":"AMG Preconditioners based on Parallel Hybrid Coarsening and Multi-objective Graph Matching","authors":"P. D'Ambra, Fabio Durastante, S. Ferdous, S. Filippone, M. Halappanavar, A. Pothen","doi":"10.1109/PDP59025.2023.00017","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00017","url":null,"abstract":"We describe preliminary results from a multi-objective graph matching algorithm, in the coarsening step of an aggregation-based Algebraic MultiGrid (AMG) preconditioner, for solving large and sparse linear systems of equations on high-end parallel computers. We have two objectives. First, we wish to improve the convergence behavior of the AMG method when applied to highly anisotropic problems. Second, we wish to extend the parallel package PSCToolkit to exploit multi-threaded parallelism at the node level on multi-core processors. Our matching proposal balances the need to simultaneously compute high weights and large cardinalities by a new formulation of the weighted matching problem combining both these objectives using a parameter $\\lambda$. We compute the matching by a parallel $2/3-\\varepsilon$-approximation algorithm for maximum weight matchings. 
Results with the new matching algorithm show that for a suitable choice of the parameter $\\lambda$ we compute effective preconditioners in the presence of anisotropy, i.e., smaller solve times, setup times, iteration counts, and operator complexity.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114852655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Montella, D. Di Luccio, Ciro Giuseppe de Vita, Gennaro Mellone, M. Lapegna, Gloria Ortega, L. Marcellino, E. Zambianchi, G. Giunta
{"title":"A highly scalable high-performance Lagrangian transport and diffusion model for marine pollutants assessment","authors":"R. Montella, D. Di Luccio, Ciro Giuseppe de Vita, Gennaro Mellone, M. Lapegna, Gloria Ortega, L. Marcellino, E. Zambianchi, G. Giunta","doi":"10.1109/PDP59025.2023.00012","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00012","url":null,"abstract":"While using High-Performance Computing (HPC) for precise and accurate air quality forecasts is a common issue, similar services devoted to marine pollution in coastal areas remain challenging. This paper presents Water quality Community Model Plus Plus (WaComM++) leveraging a parallelization schema enabling the users to run it on heterogeneous parallel architectures. We evaluated the proposed model under several execution approaches using a real-world application for pollutants forecast in the Gulf of Napoli (Campania, Italy). As a result, WaComM++ has produced results 657K times faster than the sequential run (taking into account the Particles' Outer Cycle and not considering the particle domain distribution) when using distributed and shared memory with multi-GPUs dealing with about 25 million particles.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114377035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz
{"title":"Improving inference time in multi-TPU systems with profiled model segmentation","authors":"J. Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz","doi":"10.1109/PDP59025.2023.00020","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00020","url":null,"abstract":"In this paper, we systematically evaluate the inference performance of the Edge TPU by Google for neural networks with different characteristics. Specifically, we determine that, given the limited amount of on-chip memory on the Edge TPU, accesses to external (host) memory rapidly become an important performance bottleneck. We demonstrate how multiple devices can be jointly used to alleviate the bottleneck introduced by accessing the host memory. We propose a solution combining model segmentation and pipelining on up to four TPUs, with remarkable performance improvements that range from 6x for neural networks with convolutional layers to 46x for fully connected layers, compared with single-TPU setups.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129345293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Directives Evaluation in Porous Media Application: A Case Study","authors":"Natiele Lucca, C. Schepke, Gabriel Tremarin","doi":"10.1109/PDP59025.2023.00050","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00050","url":null,"abstract":"High-performance computing provides the acceleration of scientific applications through the use of parallelism. Applications of this type usually demand a lot of computation time for a version with a single code execution stream. The adoption of different models of parallel programming enables the development of concurrent code. In this sense, this paper evaluates parallel interfaces and their programming models. Therefore, as a case study, we evaluate a porous media application that simulates grain drying using OpenMP (loop, sections, tasks, target, and teams approach) and OpenACC programming interfaces. The results show a reduction in processing time in all test cases. The total parallel simulation time for a multicore architecture using 16 physical cores was 5.61 times less using loops, 5.96 using targets, and 7.50 using teams. Task and section directives produce around 1.20 speedup due to the limitations of concurrent task executions of the application. The reduction using a single GPU was 7.54. 
We also contribute with some collected traces, identifying the parallel steps and synchronization time.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127872571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francesco Martella, Valeria Lukaj, M. Fazio, A. Celesti, M. Villari
{"title":"On-Demand and Automatic Deployment of Microservice at the Edge Based on NGSI-LD","authors":"Francesco Martella, Valeria Lukaj, M. Fazio, A. Celesti, M. Villari","doi":"10.1109/PDP59025.2023.00055","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00055","url":null,"abstract":"This paper focuses on a new approach to conceiving “virtual sensors” operating in smart environments, which are abstracted components able to map different behaviours on the same Internet of Things (IoT)-based infrastructures according to the needs of the high-level applications. To realize “virtual sensors”, it is necessary to codify user requests in an automation process for the deployment at the Edge of the microservices (MSs) that satisfy such requests. We present a solution that implements all the necessary functionalities to bind the user application with the Edge device in charge of executing the “virtual sensors”. In particular, the solution we propose is based on the FIWARE NGSI-LD information model, which helps us to standardize the communication among the different entities involved in the process. Moreover, the paper describes the reference architecture we designed, provides the implementation details of our first prototype and reports the results of our evaluation experiments.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121048463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intrusion Detection Systems for Cyber Attacks Detection in Power Line Communications Networks","authors":"K. Qureshi, N. Arshad, T. Newe","doi":"10.1109/PDP59025.2023.00038","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00038","url":null,"abstract":"Power Line Communication (PLC) is categorized into wired and wireless technologies that distribute power and transmit data at different frequency ranges. System administration is one of the significant areas in these networks for managing communication processes. Security is one of the significant concerns: attacks can make networks slow or unavailable, inject false or altered instructions, and cause malfunctions and abnormal system behavior. An Intrusion Detection System (IDS) is one of the solutions for handling security attacks and protecting systems from unauthorized access. However, existing IDSs have limited capabilities to handle new attacks. This paper proposes a Machine Learning (ML) algorithm for an IDS used in PLC networks to improve overall system performance and detect the vulnerabilities of the system. The proposed system can detect the latest attacks and protect systems from unauthorized and malicious activities. The proposed IDS is assessed in a virtual environment using the latest dataset and compared with existing traditional systems. 
The experimental results indicate that the proposed system performs better at handling new attacks and protecting the systems.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115671256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Garcia, Dalvan Griebler, C. Schepke, André Sacilotto Santos, José Daniel García Sánchez, Javier Fernández Muñoz, L. G. Fernandes
{"title":"A Latency, Throughput, and Programmability Perspective of GrPPI for Streaming on Multi-cores","authors":"A. Garcia, Dalvan Griebler, C. Schepke, André Sacilotto Santos, José Daniel García Sánchez, Javier Fernández Muñoz, L. G. Fernandes","doi":"10.1109/PDP59025.2023.00033","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00033","url":null,"abstract":"Several solutions aim to simplify the burdensome task of parallel programming. The GrPPI library is one of them. It allows users to implement parallel code for multiple backends through a unified, abstract, and generic layer while promising minimal overhead on performance. A broad evaluation of GrPPI regarding stream parallelism with representative metrics for this domain, such as throughput and latency, has not yet been done. In this work, we evaluate GrPPI with a focus on stream processing. We evaluate performance, memory usage, and programming effort and compare them against handwritten parallel code. For this, we use the benchmarking framework SPBench to build custom GrPPI benchmarks. The basis of the benchmarks is real applications, such as Lane Detection, Bzip2, Face Recognizer, and Ferret. Experiments show that while performance is competitive with handwritten code in some cases, in other cases, the infeasibility of fine-tuning GrPPI is a crucial drawback. 
Despite this, programmability experiments estimate that GrPPI has the potential to reduce the development time of parallel applications by about three times.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130926630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Coviello, Kunal Rao, Ciro Giuseppe De Vita, Gennaro Mellone, Priscilla Benedetti, S. Chakradhar
{"title":"Content-aware auto-scaling of stream processing applications on container orchestration platforms","authors":"G. Coviello, Kunal Rao, Ciro Giuseppe De Vita, Gennaro Mellone, Priscilla Benedetti, S. Chakradhar","doi":"10.1109/PDP59025.2023.00025","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00025","url":null,"abstract":"Modern applications are designed as an interacting set of microservices, and these applications are typically deployed on container orchestration platforms like Kubernetes. Several attractive features in Kubernetes make it a popular choice for deploying applications, and automatic scaling is one such feature. The default horizontal scaling technique in Kubernetes is the Horizontal Pod Autoscaler (HPA). It scales each microservice independently while ignoring the interactions among the microservices in an application. In this paper, we show that ignoring such interactions by HPA leads to inefficient scaling, and the optimal scaling of different microservices in the application varies as the stream content changes. To automatically adapt to variations in stream content, we present a novel system called DataX AutoScaler that leverages knowledge of the entire stream processing application pipeline to efficiently auto-scale different microservices by taking into account their complex interactions. 
Through experiments on real-world video analytics applications, such as face recognition and pose classification, we show that DataX AutoScaler adapts to variations in stream content and achieves up to 43% improvement in overall application performance compared to a baseline system that uses HPA.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"13 10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134110201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HTPS: Heterogeneous Transferring Prediction System for Healthcare Datasets","authors":"Jia-Hao Syu, Chun-Wei Lin, M. Fojcik, Rafał Cupek","doi":"10.1109/PDP59025.2023.00039","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00039","url":null,"abstract":"The medical Internet of Things leads to revolutionary improvements in medical services, also known as smart healthcare. With big healthcare data, data mining and machine learning can assist wellness management and intelligent diagnosis, and achieve P4 medicine. However, healthcare data has high sparsity and heterogeneity. In this paper, we propose a Heterogeneous Transferring Prediction System (HTPS). A feature engineering mechanism transforms the dataset into sparse and dense feature matrices, and autoencoders in the embedding networks not only embed features but also transfer knowledge from heterogeneous datasets. Experimental results show that the proposed HTPS outperforms the benchmark systems on various prediction tasks and datasets, and ablation studies demonstrate the effectiveness of each designed mechanism. Experimental results demonstrate the negative impact of heterogeneous data on benchmark systems and the high transferability of the proposed HTPS.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123352323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gennaro Mellone, Ciro Giuseppe de Vita, Dante D. Sánchez-Gallegos, D. Di Luccio, G. Mattei, Francesco Peluso, Pietro Patrizio Ciro Aucelli, A. Ciaramella, R. Montella
{"title":"A containerized distributed processing platform for autonomous surface vehicles: preliminary results for marine litter detection","authors":"Gennaro Mellone, Ciro Giuseppe de Vita, Dante D. Sánchez-Gallegos, D. Di Luccio, G. Mattei, Francesco Peluso, Pietro Patrizio Ciro Aucelli, A. Ciaramella, R. Montella","doi":"10.1109/PDP59025.2023.00029","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00029","url":null,"abstract":"Autonomous Surface Vehicles and their management represent one of the significant challenges in coastal and offshore surveying. Although the development of this kind of data acquisition device has skyrocketed in the last few years, guidelines and technological solutions are still lacking. On the other hand, the true potential of this kind of robotic vessel has yet to be explored. This paper presents ArgonautAI, a containerized distributed processing platform for autonomous surface vehicles. The proposed ArgonautAI architecture leverages a cluster of single-board computers with diverse characteristics (computing power, CUDA GPUs, FPGAs, GPIOs, PWMs, specialized I/O) orchestrated using Kubernetes and a customized programming interface. Furthermore, the proposed solution introduces two different types of containers: 1) the platform containers hosting the software life support for the platform and 2) the mission containers defined to support the survey mission-specific scopes. The former manage the vehicle's instruments (e.g. position, attitude, environment, depth), the data storage, the vessel-to-shore communication, and so on; the latter host mission-specific software components. 
Finally, as a proof of concept of the proposed platform, we present an AI-based marine litter detection application using a hierarchical computer vision approach on heterogeneous onboard computing resources.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"337 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130871524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}