2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP): latest publications

A parallel software pipeline to select relevant genes for pathway enrichment
Giuseppe Agapito, M. Cannataro
Abstract: The continuous development of experimental omics technologies such as microarrays makes it possible to perform large-scale genomics studies. After the initial enthusiasm, it became clear that even the results provided by microarrays, in the form of lists of differentially expressed genes (DEGs), were largely as enigmatic as the first sequence of the genome, because these lists are detached from the biological mechanisms they influence. Pathway enrichment analysis (PEA) gives researchers the clues needed to link DEGs to the affected biological pathways and, consequently, to the underlying biological mechanisms and processes. Putting DEG data sets into a format suitable for PEA can be a tedious, error-prone, and laborious process even for bioinformaticians, who must do it manually before PEA can start. To fill this gap, we present a parallel software pipeline that takes a list of DEGs as input and automatically returns the enriched pathways. The pipeline is implemented in Python and automates the following actions: i) parallel splitting of DEGs into groups; ii) parallel building of the similarity matrices for the DEG groups; iii) parallel mapping of the similarity matrices to networks; iv) parallel pathway enrichment analysis for each group of identified DEGs. Preliminary results show that the pipeline can help analyze DEGs and generate, in a few minutes, a list of pathway enrichment results that would otherwise require many hours of manual work and several different scripts. The pipeline provides a two-fold benefit. First, it speeds up the computation of pathway enrichment by automating several steps currently performed manually. Second, it provides a more specific list of DEGs from which to calculate pathway enrichment, improving the relevance and significance of the enriched pathways.
DOI: https://doi.org/10.1109/pdp55904.2022.00041
Published: 2022-03-01
Citations: 1
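
The four stages named in the abstract above (group splitting, similarity matrices, networks, enrichment) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which similarity measure, network threshold, or enrichment test the pipeline uses, so the absolute Pearson correlation, the 0.8 cut-off, the hypergeometric over-representation test, and every function name here are assumptions. The splitting step is done serially for brevity, and threads stand in for whatever parallel workers the real pipeline uses.

```python
from concurrent.futures import ThreadPoolExecutor
from math import comb, sqrt


def split_into_groups(genes, n_groups):
    """Stage i (serial here): round-robin split of the DEG list."""
    return [genes[i::n_groups] for i in range(n_groups)]


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def similarity_matrix(group, expression):
    """Stage ii: pairwise |Pearson| similarity (assumed measure)."""
    return {(g, h): abs(pearson(expression[g], expression[h]))
            for g in group for h in group if g < h}


def to_network(sim, threshold=0.8):
    """Stage iii: keep an edge when similarity reaches the assumed threshold."""
    return {pair for pair, s in sim.items() if s >= threshold}


def hypergeom_pvalue(k, n_selected, n_pathway, n_background):
    """P(X >= k) when drawing n_selected genes from n_background genes."""
    upper = min(n_selected, n_pathway)
    hits = sum(comb(n_pathway, i) * comb(n_background - n_pathway, n_selected - i)
               for i in range(k, upper + 1))
    return hits / comb(n_background, n_selected)


def enrich(network, pathways, n_background):
    """Stage iv: over-representation test on the genes kept in the network."""
    selected = {g for edge in network for g in edge}
    return {name: hypergeom_pvalue(len(selected & members), len(selected),
                                   len(members), n_background)
            for name, members in pathways.items()}


def run_pipeline(genes, expression, pathways, n_background, n_groups=2, workers=2):
    """Run stages ii to iv concurrently, one task per gene group."""
    groups = split_into_groups(sorted(genes), n_groups)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sims = list(pool.map(lambda g: similarity_matrix(g, expression), groups))
        nets = list(pool.map(to_network, sims))
        return list(pool.map(lambda n: enrich(n, pathways, n_background), nets))
```

Calling `run_pipeline(genes, expression, pathways, n_background)` returns one dictionary of pathway p-values per gene group; for CPU-bound workloads a process pool would likely be a better fit than threads.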
Load Balancing of the Parallel Execution of Two Dimensional Partitioned Cellular Automata
Andrea Giordano, Francesca Amelia, Salvatore Gigliotti, R. Rongo, W. Spataro
Abstract: Load balancing generally refers to the technique of partitioning computation among processing elements so as to achieve optimal resource usage and thus reduce computation time. In this paper, we present a dynamic load balancing application in the context of the parallel execution of Cellular Automata, where the domain space is partitioned into two-dimensional regions that are assigned to different processing elements. Starting from general closed-form expressions that give the optimal workload assignment dynamically when partitioning takes place along only one dimension, we extend the procedure to allow partitioning and balancing along both dimensions. As confirmed by the experimental results, two-dimensional partitioning by itself speeds up the execution, and further improvements are obtained when load balancing occurs along both dimensions.
DOI: https://doi.org/10.1109/pdp55904.2022.00039
Published: 2022-03-01
Citations: 0
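
To make the one-dimensional case in the abstract above concrete, the sketch below cuts a row of cellular-automaton columns into contiguous blocks of roughly equal cost. It is a simple prefix-sum heuristic, not the closed-form expressions the paper refers to (the abstract does not state them), and `column_cost` is a hypothetical per-column workload estimate such as the number of active cells.

```python
from bisect import bisect_left
from itertools import accumulate


def balance_1d(column_cost, n_parts):
    """Return block boundaries [0, b1, ..., len(column_cost)].

    Block k covers columns bounds[k] .. bounds[k + 1] - 1. Each cut is placed
    at the first column where the running cost reaches k / n_parts of the total.
    """
    prefix = list(accumulate(column_cost))
    total = prefix[-1]
    bounds = [0]
    for k in range(1, n_parts):
        target = total * k / n_parts
        cut = bisect_left(prefix, target) + 1   # columns 0 .. cut-1 go left
        cut = max(cut, bounds[-1])              # keep boundaries monotone
        bounds.append(min(cut, len(column_cost)))
    bounds.append(len(column_cost))
    return bounds


def block_costs(column_cost, bounds):
    """Total cost assigned to each block, useful for checking the balance."""
    return [sum(column_cost[a:b]) for a, b in zip(bounds, bounds[1:])]


costs = [3, 3, 1, 1, 1, 1, 1, 1]
bounds = balance_1d(costs, 2)        # -> [0, 2, 8]
print(block_costs(costs, bounds))    # -> [6, 6]
```

One way to extend this toward two dimensions, again only as an assumption, would be to apply the same routine first to row costs and then to column costs within each row band.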
Design and Evaluation of Multi-threaded Optimizations for Individual MPI I/O Operations
Raafat Feki, E. Gabriel
Abstract: Today's high-end parallel clusters are architecturally very complex. Most large-scale applications now use multiple parallel programming paradigms to achieve the required scalability, with MPI+threads being the most common approach. Yet, as of today, there is no parallel I/O library that matches this hybrid programming model: file I/O operations are typically executed by a single thread in each process. This paper explores multi-threaded optimizations for individual MPI I/O operations, an important step towards matching the execution model of modern parallel applications. We describe the changes necessary to the internal processing in the MPI I/O library as well as to the file access phase. We demonstrate the performance improvement of the redesigned functions over the original, single-threaded version using multiple benchmarks on multiple platforms and in many scenarios.
DOI: https://doi.org/10.1109/pdp55904.2022.00027
Published: 2022-03-01
Citations: 1
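
The general idea of letting several threads serve one write request, as discussed in the abstract above, can be illustrated without MPI. The sketch below is not the paper's design and does not use any MPI I/O library; it only assumes a POSIX system, where `os.pwrite` lets each thread write its own byte range of a shared buffer at an explicit file offset. The function name `threaded_write` is hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def threaded_write(path, data, n_threads=4):
    """Write `data` to `path` using up to n_threads positional writes."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        chunk = max(1, -(-len(data) // n_threads))   # ceiling division
        offsets = range(0, len(data), chunk)

        def write_chunk(off):
            # Each thread owns a disjoint byte range, so no locking is needed.
            view = memoryview(data)[off:off + chunk]
            written = 0
            while written < len(view):               # handle short writes
                written += os.pwrite(fd, view[written:], off + written)
            return written

        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            return sum(pool.map(write_chunk, offsets))
    finally:
        os.close(fd)
```

The function returns the total number of bytes written. Whether splitting a request this way actually helps depends on the file system and storage hardware, which is the kind of question the paper evaluates.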
SeRSS: a storage mesh architecture to build serverless reliable storage services
D. Carrizales-Espinoza, Dante D. Sánchez-Gallegos, J. L. González-Compeán, J. Carretero, R. Marcelín-Jiménez
Abstract: Cloud storage has been the solution for organizations to manage the exponential growth of data observed over the past few years. However, end users still suffer from the side effects of cloud service outages, which particularly affect edge-fog-cloud environments. This paper presents SeRSS, a storage mesh architecture to create and operate reliable, configurable, and flexible serverless storage services for heterogeneous infrastructures. A case study was conducted based on the on-the-fly building of storage services to manage medical imagery. The experimental evaluation revealed the efficiency of SeRSS in managing and storing data reliably in heterogeneous infrastructures.
DOI: https://doi.org/10.1109/pdp55904.2022.00022
Published: 2022-03-01
Citations: 3
A Scalable Architecture Exploiting Elastic Stack and Meta Ensemble of Classifiers for Profiling User Behaviour
G. Folino, Carla Otranto Godano, F. S. Pisani
Abstract: Large user and application logs are generated and stored by many organisations at a rate that makes them very hard to analyse, especially in real time. In particular, in the field of cybersecurity, it is of great interest to quickly analyse user logs coming from different and heterogeneous sources in order to prevent data breaches caused by user behaviour. In addition, part of the data, or even entire sources, are often missing. To overcome these issues, we propose a framework based on the Elastic Stack (ELK) that processes and stores log data coming from different users and applications and generates an ensemble of classifiers in order to classify user behaviour and, eventually, to detect anomalies. The system exploits the scalable architecture of ELK by running on top of a Kubernetes platform and adopts a distributed evolutionary algorithm for classifying users on the basis of their digital footprints, derived from many data sources. Preliminary experiments show that the system is effective in classifying the behaviour of different users and that this classification can serve as an auxiliary task for detecting anomalies in their behaviour, helping to reduce the number of false alarms.
DOI: https://doi.org/10.1109/pdp55904.2022.00037
Published: 2022-03-01
Citations: 1
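
One reason an ensemble suits the missing-source problem mentioned in the abstract above is that members tied to an unavailable source can simply abstain. The sketch below shows this with plain majority voting. It is an assumption for illustration only: the paper builds its meta-ensemble with a distributed evolutionary algorithm, which is not reproduced here, and the source names and toy classifiers are hypothetical.

```python
from collections import Counter


def ensemble_predict(classifiers, sample):
    """Majority vote over per-source classifiers, skipping missing sources.

    classifiers: dict mapping a source name to a callable(features) -> label
    sample:      dict mapping a source name to features, or None if missing
    """
    votes = Counter()
    for source, classify in classifiers.items():
        features = sample.get(source)
        if features is None:          # source absent for this user: abstain
            continue
        votes[classify(features)] += 1
    return votes.most_common(1)[0][0] if votes else None


classifiers = {
    "web":  lambda f: "admin" if f["requests"] > 10 else "user",
    "mail": lambda f: "user",
    "vpn":  lambda f: "admin",
}
sample = {"web": {"requests": 20}, "mail": None, "vpn": {"sessions": 1}}
print(ensemble_predict(classifiers, sample))   # -> admin
```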
A Neural Network to Estimate Isolated Performance from Multi-Program Execution
Manel Lurbe, Josué Feliu, S. Petit, M. E. Gómez, J. Sahuquillo
Abstract: When multiple applications run on a platform with shared resources, such as a multicore CPU, the behaviour of a running application can be altered by its co-runners. In this case, the system resources need to be managed (e.g., by repartitioning the cache space, rescheduling applications on different cores, or modifying the prefetcher configuration) to reduce inter-application interference and thus minimize the performance loss relative to isolated execution. In this context, a main challenge in computing scenarios such as the public cloud or soft real-time systems is knowing the performance impact of a given management action on each application with respect to its isolated execution. To this end, we present a neural network-based approach that estimates, from multi-program executions, the performance an application would have had in isolation. Experimental results show that the proposal dynamically adapts to changes in application behaviour. On average, the predicted performance has an error of 11.7% in terms of MAPE and 2.3% in terms of MSE.
DOI: https://doi.org/10.1109/pdp55904.2022.00018
Published: 2022-03-01
Citations: 0
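
For readers unfamiliar with the two error metrics quoted in the abstract above, the sketch below gives their standard definitions: mean absolute percentage error (MAPE) and mean squared error (MSE). The abstract reports MSE as a percentage, and how the authors normalise it is not stated, so only the textbook forms are shown. The sample values are made up for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error, in the squared units of the measured quantity."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


isolated_ipc = [2.0, 4.0]      # hypothetical measured isolated performance
estimated_ipc = [1.0, 5.0]     # hypothetical model estimates
print(mape(isolated_ipc, estimated_ipc))   # -> 37.5
print(mse(isolated_ipc, estimated_ipc))    # -> 1.0
```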
NAS Parallel Benchmark Kernels with Python: A performance and programming effort analysis focusing on GPUs
D. D. Domenico, G. H. Cavalheiro, J. F. Lima
Abstract: GPU devices are currently one of the trending topics in parallel computing. GPU applications are commonly developed with programming tools based on compiled languages, such as C/C++ and Fortran. This paper presents a performance and programming effort analysis that uses the high-level language Python to implement the NAS Parallel Benchmark (NPB) kernels targeting GPUs. We used the Numba environment to enable CUDA support in Python, a tool that allows a GPU application to be implemented in pure Python code. Our experimental results show that the Python applications reached performance similar to C++ programs using CUDA and better than C++ using OpenACC for most NPB kernels. Furthermore, the Python codes required fewer operations related to the GPU framework than CUDA, mainly because Python needs fewer statements to manage memory allocations and data transfers. However, our Python versions demanded more operations than the OpenACC implementations.
DOI: https://doi.org/10.1109/pdp55904.2022.00013
Published: 2022-03-01
Citations: 4
Mitigating Transceiver and Token Controller Permanent Faults in Wireless Network-on-Chip
Navonil Chatterjee, Marcelo Ruaro, Kevin J. M. Martin, J. Diguet
Abstract: Conventional wired Network-on-Chip (NoC) designs suffer from performance degradation due to multi-hop long-distance communication. To address this problem, researchers over the past decade have focused on Wireless NoC (WiNoC), which has emerged as a viable solution that mitigates this communication bottleneck by using single-hop long-range wireless links. However, many researchers have reported that these interconnects may fail because of the complexity of their implementation. Although a few works in the literature tackle faults in WiNoC, none of them provides a comprehensive study of channel access mechanisms in the presence of faults. To fill this gap, we propose a fault-aware WiNoC architecture. We discuss two types of faults in wireless interconnects, namely transceiver faults and token controller faults, and we provide different fault-tolerant techniques to deal with them. The proposed FTWiNoC achieves, on average, 17.8% and 8.9% improvements in latency compared with two different fault mitigation strategies from the literature.
DOI: https://doi.org/10.1109/pdp55904.2022.00045
Published: 2022-03-01
Citations: 1
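
To give a feel for what fault-aware channel access can mean in a token-based scheme like the one the abstract above alludes to, the sketch below passes a token round-robin among wireless interfaces while skipping those marked faulty. This is a generic illustration under stated assumptions: the abstract does not describe the paper's actual techniques, and the function name and fault model are hypothetical.

```python
def next_token_holder(current, n_nodes, faulty):
    """Return the next healthy node after `current` in round-robin order.

    faulty is a set of node indices whose transceiver is permanently broken.
    Returns None when every node is faulty.
    """
    for step in range(1, n_nodes + 1):
        candidate = (current + step) % n_nodes
        if candidate not in faulty:
            return candidate
    return None


print(next_token_holder(0, 4, {1}))      # -> 2, node 1 is skipped
print(next_token_holder(3, 4, {0, 1}))   # -> 2, wraps around past 0 and 1
```

Skipping faulty nodes avoids the token waiting on a transceiver that can never transmit; tolerating a fault in the token controller itself would need additional mechanisms that this sketch does not cover.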
Parallel integer multiplication
Vivien Samuel
Abstract: Multiplication is a fundamental step in many algorithms. If the multiplication of two integers of n words has a complexity of M(n), divisions and squares can be computed in O(M(n)) as well, and the greatest common divisor can be computed in O(M(n) log n). A small value of M(n) is therefore extremely important. To this day, the best known algorithm for operand sizes reachable in practice is the Schönhage-Strassen algorithm, which is implemented by a few arithmetic libraries. Asymptotically faster algorithms exist; however, no computer can hold numbers big enough for those algorithms to outrun Schönhage-Strassen. The GNU Multiple Precision (GMP) library has a sequential-only implementation of Schönhage-Strassen. However, some algorithms contain a step that is a single big multiplication, so parallelizing such an algorithm requires a parallel algorithm for multiplication. An example is the batch factorization for the Number Field Sieve. People implementing parallel versions of such algorithms therefore need an arithmetic library that implements parallel integer multiplication. An example of such a library is FLINT (Fast Library for Number Theory), which contains a parallel implementation of Schönhage-Strassen. In this article we present an implementation of Schönhage-Strassen that reaches a speedup of 20 for the multiplication of two integers of 10^7 words of 64 bits using a Xeon Gold with 32 cores.
DOI: https://doi.org/10.1109/pdp55904.2022.00024
Published: 2022-03-01
Citations: 0
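
The speedup figure in the abstract above (20 on 32 cores) can be put in context with two standard derived quantities: parallel efficiency and the Karp-Flatt experimentally determined serial fraction. Neither number is reported in the abstract; the sketch below only shows the arithmetic.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / cores


def karp_flatt(speedup, cores):
    """Karp-Flatt metric: the serial fraction implied by a measured speedup."""
    return (1 / speedup - 1 / cores) / (1 - 1 / cores)


print(parallel_efficiency(20, 32))      # -> 0.625
print(round(karp_flatt(20, 32), 4))     # -> 0.0194
```

Read this way, a speedup of 20 on 32 cores corresponds to 62.5% efficiency, which is what a workload with roughly 2% effectively serial work would show under Amdahl's law.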
An efficient compilation of coarse-grained reconfigurable architectures utilizing pre-optimized sub-graph mappings
Ayaka Ohwada, Takuya Kojima, H. Amano
Abstract: In recent years, IoT devices have become widespread, and energy-efficient coarse-grained reconfigurable architectures (CGRAs) have attracted attention. CGRAs comprise several processing units, called processing elements (PEs), arranged in a two-dimensional array. The operations of the PEs and the interconnections between them are changed adaptively depending on the target application, which yields higher energy efficiency than general-purpose processors. The application kernel executed on a CGRA is represented as a data flow graph (DFG), and CGRA compilers are responsible for mapping the DFG onto the PE array. Mapping algorithms therefore significantly influence the performance and power efficiency of CGRAs as well as the compile time. This paper proposes POCOCO, a compiler framework for CGRAs that can use pre-optimized subgraph mappings, which reduces the optimization work left to the compiler. To leverage the subgraph mappings, we extend an existing mapping method based on a genetic algorithm. Experiments on three architectures demonstrated that the proposed method reduces the optimization time by 48% on average in the best case among the three architectures.
DOI: https://doi.org/10.1109/pdp55904.2022.00010
Published: 2022-03-01
Citations: 2