Cristhian Martinez Rendon, J. L. González-Compeán, Dante D. Sánchez-Gallegos, J. Carretero
{"title":"Blockchain-based schemes for continuous verifiability and traceability of IoT data","authors":"Cristhian Martinez Rendon, J. L. González-Compeán, Dante D. Sánchez-Gallegos, J. Carretero","doi":"10.1109/PDP59025.2023.00034","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00034","url":null,"abstract":"This paper presents a continuous delivery/continuous verifiability (CD/CV) framework for IoT dataflows in edge-fog-cloud. In this framework, a CD model based on an extraction, transformation, and load (ETL) mechanism, together with a directed acyclic graph (DAG) construction, enables end-users to create efficient schemes for the continuous verification and validation of the execution of applications in edge-fog-cloud infrastructures. This framework also provides tools for continuous verification and validation (CV) of predefined execution sequences and the integrity of digital assets using blockchain. The CV model converts ETL and DAG into a business model and smart contracts in a private blockchain for the automatic and transparent registration of transactions performed by each application in workflows/pipelines created by the CD model, without altering the applications or the edge-fog-cloud workflows. This framework ensures that IoT dataflows deliver verifiable information for organizations to conduct critical decision-making processes with certainty. A containerized parallelism approach solves portability issues and reduces/compensates the overhead produced by CD/CV operations. The talk will also present evaluation results of the CD/CV framework based on a case study where user mobility information is used to identify interest points, patterns, and maps. The experimental evaluation results show the feasibility of CD/CV to register transactions performed in IoT dataflows through edge-fog-cloud in a private blockchain network.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130397587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aymar Cublier Martínez, Álejandro Alvarez Isabel, Jesús Carretero, D. E. Singh
{"title":"Fine-grained parallel social modelling for analyzing COVID-19 propagation","authors":"Aymar Cublier Martínez, Álejandro Alvarez Isabel, Jesús Carretero, D. E. Singh","doi":"10.1109/PDP59025.2023.00026","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00026","url":null,"abstract":"Agent-based epidemiological simulators have been proven to be one of the most successful tools for the analysis of the COVID-19 propagation. The ability of these tools to reproduce the behavior and interactions of each single individual leads to accurate and detailed results that can be used to model fine-grained health-related policies like selective vaccination campaigns or immunity waning. One characteristic of these tools is the large amount of input data and computational resources that they require. This relies on the development of parallel algorithms and methodologies for generating, accessing and processing large volumes of data from multiple data sources. This work presents a parallel workflow for extending the social modelling of EpiGraph, an agent-based simulator. We have included two novel parallel social generation stages, which provide a detailed and realistic social model, and one new visualization stage. The work presents a description of the algorithms used in each stage and a practical evaluation on a real platform. Results show that this contribution can be efficiently executed in parallel architectures and increases the simulation detail level, representing a significant advance in the simulator scenario modelling.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"224 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126124881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Galuzzi, Stefano Izzo, F. Giampaolo, Salvatore Cuomo, Marco E. Vanoni, L. Alberghina, C. Damiani, F. Piccialli
{"title":"Coupling constrained-based flux sampling and clustering to tackle cancer metabolic heterogeneity","authors":"B. Galuzzi, Stefano Izzo, F. Giampaolo, Salvatore Cuomo, Marco E. Vanoni, L. Alberghina, C. Damiani, F. Piccialli","doi":"10.1109/PDP59025.2023.00037","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00037","url":null,"abstract":"Characterizing the heterogeneity of cancer metabolism requires the knowledge of metabolic fluxes in different tumor types. These fluxes cannot be directly determined, especially at a sub-cellular level. Still, they can be obtained numerically through constraint-based steady-state models after integrating other high-throughput -omics data, such as transcriptomics. In this work, we proposed to study cancer metabolism through data analysis and machine learning methodologies. To this aim, we considered transcriptomics profiles for a large set of cancer cells. Using a core metabolic network as a scaffold, we generated many feasible flux distributions for each cancer cell. Then, we used cluster analysis to analyze these data. This preliminary analysis revealed three well-separated clusters having different metabolic behaviors.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133074101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ciro Giuseppe de Vita, Gennaro Mellone, Aniello Florio, Catherine Alessandra Torres Charles, D. Di Luccio, M. Lapegna, G. Benassai, G. Budillon, R. Montella
{"title":"Parallel and hierarchically-distributed Shoreline Alert Model (SAM)","authors":"Ciro Giuseppe de Vita, Gennaro Mellone, Aniello Florio, Catherine Alessandra Torres Charles, D. Di Luccio, M. Lapegna, G. Benassai, G. Budillon, R. Montella","doi":"10.1109/PDP59025.2023.00024","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00024","url":null,"abstract":"In this paper, the Shoreline Alert Model (SAM) is presented as a component of a computation platform based on workflows dedicated to extreme weather/marine event simulation. The model aims to mitigate the effects of global change by providing decision-makers, scientists, and engineers with a novel, next-generation tool set for facing extreme weather events and implementing related management or emergency responses. SAM uses a parallelization schema, allowing users to run it on heterogeneous parallel architectures. As a result, SAM produces results approximately 24 times faster than the baseline when using shared memory with distributed memory and dealing with about 20,000 transects along the Campania coastline. The system is based on the algorithms of the open-source numerical models WRF (Weather Research and Forecasting) and WW3 (Wave-watch III) implemented with refraction and shoaling routines together with run-up equations to form the modeling chain used for coastal flooding assessment.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115796652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bandit-based Variable Fixing for Binary Optimization on GPU Parallel Computing","authors":"Ryota Yasudo","doi":"10.1109/PDP59025.2023.00031","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00031","url":null,"abstract":"This paper explores whether reinforcement learning is capable of enhancing metaheuristics for the quadratic unconstrained binary optimization (QUBO), which have recently attracted attention as a solver for a wide range of combinatorial optimization problems. In particular, we introduce a novel approach called the bandit-based variable fixing (BVF). The key idea behind BVF is to regard an execution of an arbitrary metaheuristic with a variable fixed as a play of a slot machine. Thus, BVF explores variables to fix with the maximum expected reward, and executes a metaheuristic at the same time. The bandit-based approach is then extended to fix multiple variables. To accelerate solving multi-armed bandit problem, we implement a parallel algorithm for BVF on a GPU. Our results suggest that our proposed BVF enhances original metaheuristics.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116030140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Danelutto, G. Mencagli, Alberto Ottimo, Francesco Iannone, P. Palazzari
{"title":"FastFlow targeting FPGAs","authors":"M. Danelutto, G. Mencagli, Alberto Ottimo, Francesco Iannone, P. Palazzari","doi":"10.1109/PDP59025.2023.00023","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00023","url":null,"abstract":"Writing good code for FPGA is a challenge “per se”, but also running already existing and optimized FPGA kernels often requires writing specific “host side” code and some target hardware knowledge to achieve good performance. In this work, we describe a FastFlow extension supporting seamless offloading of tasks to FPGA, once an FPGA kernel is available. In particular, we show how kernels implemented in Vitis and running on XILINX Alveo FPGA boards may be integrated to implement “normal” parallel stages (pipeline stages, map/farm workers) in a structured parallel FastFlow computation. Experimental results are shown, demonstrating the feasibility of the approach.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128626495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synchronization Efficient Scheduling of Fine-grained Irregular Programs","authors":"Tao Tao","doi":"10.1109/PDP59025.2023.00044","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00044","url":null,"abstract":"This paper discusses the theory behind the global rebalancing policy, a new emerging paradigm of scheduling dynamic, irregular programs. According to this policy, task workload is distributed in rebalancing sessions enabled by global thread barriers, while traditional approaches such as work stealing rely on localized, concurrent deque operations. We show that the parallel execution model based on the global rebalancing policy has an amortized running time bound of $\\mathscr{O}(T_{1}/P+T_{\\infty})$, including all synchronization overhead. Based on this result, we further conclude that the global rebalancing policy asymptotically outperforms traditional work stealing as long as the input program is sufficiently parallel. We also present twelve benchmarks to demonstrate the parallel scalability of our system up to 16 processors: All benchmarks scale close to ideal.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125338667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Resource Partitioning for Multi-Tenant Systolic Array Based DNN Accelerator","authors":"M. Reshadi, David Gregg","doi":"10.1109/PDP59025.2023.00019","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00019","url":null,"abstract":"Deep neural networks (DNN) have become a significant application in both cloud-server and edge devices. Meanwhile, the growing number of DNNs on those platforms raises the need to execute multiple DNNs on the same device. This paper proposes a dynamic partitioning algorithm to perform concurrent processing of multiple DNNs on a systolic-array-based accelerator. Sharing an accelerator's storage and processing resources across multiple DNNs increases resource utilization and reduces computation time and energy consumption. To this end, we propose a partitioned weight stationary dataflow with a minor modification in the logic of the processing element. We evaluate the energy consumption and computation time with both heavy and light workloads. Simulation results show a 35% and 62% improvement in energy consumption and 56% and 44% in computation time under heavy and light workloads, respectively, compared with single tenancy.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132831372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward Matrix Multiplication for Deep Learning Inference on the Xilinx Versal","authors":"Jie Lei, J. Flich, Enrique S. Quintana-Ortí","doi":"10.1109/PDP59025.2023.00043","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00043","url":null,"abstract":"The remarkable positive impact of Deep Neural Networks on many Artificial Intelligence (AI) tasks has led to the development of various high performance algorithms as well as specialized processors and accelerators. In this paper we address this scenario by demonstrating that the principles underlying the modern realization of the general matrix multiplication (GEMM) in conventional processor architectures are also valid to achieve high performance for the type of operations that arise in deep learning (DL) on an exotic accelerator such as the AI Engine (AIE) tile embedded in Xilinx Versal platforms. In particular, our experimental results with a prototype implementation of the GEMM kernel on a Xilinx Versal VCK190 show performance close to 86.7% of the theoretical peak that can be expected on an AIE tile, for 16-bit integer operands.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114836421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Accelerator for Deep Learning-based Point Cloud Registration on FPGAs","authors":"K. Sugiura, Hiroki Matsutani","doi":"10.1109/PDP59025.2023.00018","DOIUrl":"https://doi.org/10.1109/PDP59025.2023.00018","url":null,"abstract":"Point cloud registration is the basis for many robotic applications such as odometry and Simultaneous Localization And Mapping (SLAM), which are increasingly important for autonomous mobile robots. The limitation of computational resources and power budgets on such robots motivates us to study the resource-efficient registration method on low-cost edge devices. In this paper, we propose an FPGA-based novel pipeline for 3D point cloud registration built upon a recent deep learning-based method, PointNetLK. Based on the profiling results, we focus on the PointNet feature extraction as it becomes a major bottleneck; we improve its scalability and memory-efficiency by consuming each input point one-by-one in a pipelined manner instead of processing the whole point cloud at once. We then design a fully-parallelized and pipelined accelerator consisting of a custom PointNet IP core, which fits within both low-cost and mid-range FPGAs (e.g., Avnet Ultra96v2 and Xilinx ZCU104). Experimental results show that our proposed pipeline achieves up to 21.34x and 69.60x faster registration speed than the vanilla PointNetLK and ICP, respectively, while only consuming 722 mW and maintaining the same level of accuracy.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121771718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}