Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD): Latest Articles

An Approach for Evaluating and Mitigating Intra-Application I/O Performance Variability Over Parallel File Systems

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8709

E. C. Inacio, M. Dantas

Abstract: To meet the ever-increasing capacity and performance requirements of emerging data-intensive applications, highly distributed and multilayered back-end storage systems have been employed in large-scale high performance computing (HPC) environments. A main component of these storage infrastructures is the parallel file system (PFS), a file system specially designed to absorb bulk data transfers from applications with thousands of concurrent processes. Load distribution on PFS data servers is a major source of intra-application input/output (I/O) performance variability. Although mitigating variability is desirable, as it is known to harm application-perceived performance, understanding and dealing with I/O performance variability in such complex environments remains a challenging task. In this research, a differentiated approach for evaluating and mitigating intra-application I/O performance variability over PFSs is proposed. More specifically, from the evaluation perspective, a comprehensive approach combining complementary methods is proposed. An analytical model, named DTSMaxLoad, provides estimates for the maximum load in a PFS data server. To complement DTSMaxLoad by modeling conditions and mechanisms that are hard to represent analytically, the Parallel I/O and Storage System (PIOSS) simulation model was proposed. Finally, for experimental evaluation over real environments, a flexible and distributed I/O performance evaluation tool, named IOR-Extended (IORE), was proposed.
Furthermore, a high-level file distribution approach for PFSs, called N-N Round-Robin (N2R2), was proposed, focusing on mitigating I/O performance variability for distributed applications where each process accesses an individual and independent file. An extensive experimental effort, including measurements on real environments, was conducted to evaluate each of the proposed approaches. In summary, this evaluation indicated that both the DTSMaxLoad and PIOSS modeling proposals can represent load distribution behavior on PFSs with significant fidelity. Moreover, results demonstrated that N2R2 successfully reduced intra-application I/O performance variability in 270 distinct experimental scenarios, which ultimately translated into overall application I/O performance improvements.
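The abstract above does not describe how N2R2 or DTSMaxLoad work internally, so the following is only a minimal sketch of the general idea behind round-robin file placement. It assumes, purely for illustration, that each per-process file lives entirely on one data server; the function names `round_robin_placement`, `random_placement`, and `max_load` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def round_robin_placement(num_files: int, num_servers: int) -> list[int]:
    """Assign file i to server i mod num_servers (illustrative layout only)."""
    return [i % num_servers for i in range(num_files)]


def random_placement(num_files: int, num_servers: int, seed: int = 0) -> list[int]:
    """Assign each file to a uniformly random server (baseline for comparison)."""
    rng = random.Random(seed)
    return [rng.randrange(num_servers) for _ in range(num_files)]


def max_load(placement: list[int]) -> int:
    """Number of files held by the most heavily loaded server."""
    return max(Counter(placement).values())


if __name__ == "__main__":
    files, servers = 64, 8
    print("round-robin max load:", max_load(round_robin_placement(files, servers)))
    print("random max load:     ", max_load(random_placement(files, servers)))
```

Under these assumptions, round-robin placement always reaches the smallest possible maximum load (the ceiling of files divided by servers), whereas random placement can only match or exceed it. That gap is one plausible source of the per-process variability the abstract refers to.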

Citations: 0

Avaliação de Máquinas Preemptáveis nos Provedores de Nuvem Pública Amazon e Google [Evaluation of Preemptible Machines on the Public Cloud Providers Amazon and Google]

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8695

J. Soares, Aleteia Araujo

Citations: 0

Soluções Paralelas para o Problema de Roteamento Usando o Algoritmo de Lee [Parallel Solutions for the Routing Problem Using Lee's Algorithm]

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8692

William Tavares, Nahri Moreano

Citations: 0

Secure and efficient software implementation of QC-MDPC code-based cryptography

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8710

A. Guimarães, Diego F. Aranha, E. Borin

Abstract: The emergence of quantum computers is pushing an unprecedented transition in the field of public key cryptography. Conventional algorithms, mostly represented by elliptic curves and RSA, are vulnerable to attacks using quantum computers and therefore need to be replaced. Cryptosystems based on error-correcting codes are considered some of the most promising candidates to replace them for encryption schemes. Among the code families, QC-MDPC codes achieve the smallest key sizes while maintaining the desired security properties. Their performance, however, still needs to be greatly improved to reach a competitive level. In this work, we focus on optimizing the performance of QC-MDPC code-based cryptosystems through improvements to both their implementations and their algorithms. We first present a new enhanced version of QcBits' key encapsulation mechanism, which is a constant-time implementation of the Niederreiter cryptosystem using QC-MDPC codes. In this version, we updated the implementation parameters to meet the 128-bit quantum security level, replaced some of the core algorithms to avoid slower instructions, vectorized the entire code using the AVX-512 instruction set extension, and introduced some other minor improvements. Compared with the current state-of-the-art implementation for QC-MDPC codes, the BIKE implementation, our code performs 1.9 times faster when decrypting messages. We then optimize the performance of QC-MDPC code-based cryptosystems through the insertion of a configurable failure rate in their arithmetic procedures.
We present constant-time algorithms with a configurable failure rate for multiplication and inversion over binary polynomials, the two most expensive subroutines used in QC-MDPC implementations. Using a failure rate negligible compared to the security level (2^{-128}), our multiplication is 2 times faster than the one used in the NTL library on sparse polynomials and 1.6 times faster than a naive constant-time sparse polynomial multiplication. Our inversion algorithm, based on the inversion algorithm of Wu et al., is 2 times faster than the original and 12 times faster than the inversion algorithm of Itoh and Tsujii using the same modulus polynomial (x^{32749} - 1). By inserting these algorithms in our enhanced version of QcBits, we were able to achieve a speedup of 1.9 on key generation and up to 1.4 on decryption time. Compared with BIKE, our final version of QcBits performs the uniform decryption 2.7 times faster. Moreover, the techniques presented in this work can also be applied to BIKE, opening new possibilities for further improvements.
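For readers unfamiliar with the arithmetic involved, here is a minimal sketch of sparse-by-dense multiplication of binary polynomials modulo x^r - 1, the kind of operation the abstract identifies as expensive. It is a plain textbook formulation, neither constant-time nor the authors' algorithm, and it has no failure-rate mechanism; the names `rotate_left` and `sparse_mul` and the representation (a Python integer whose bit i is the coefficient of x^i) are assumptions made for illustration.

```python
def rotate_left(bits: int, shift: int, r: int) -> int:
    """Multiply a polynomial by x^shift modulo x^r - 1 over GF(2).

    The polynomial is stored as an integer: bit i is the coefficient of x^i.
    Multiplying by x^shift is a cyclic rotation of the r coefficient bits.
    """
    shift %= r
    mask = (1 << r) - 1
    return ((bits << shift) | (bits >> (r - shift))) & mask


def sparse_mul(sparse_exponents: list[int], dense: int, r: int) -> int:
    """Multiply a sparse polynomial by a dense one in GF(2)[x] / (x^r - 1).

    The sparse operand is given as the list of exponents with coefficient 1,
    so the product is the XOR of one rotated copy of `dense` per exponent.
    NOTE: the running time depends on the operands, so this sketch is NOT
    constant-time and must not be used with secret data.
    """
    acc = 0
    for exponent in sparse_exponents:
        acc ^= rotate_left(dense, exponent, r)
    return acc


if __name__ == "__main__":
    # (1 + x) * (1 + x^6) mod (x^7 - 1) = x + x^6 over GF(2)
    print(bin(sparse_mul([0, 1], 0b1000001, 7)))
```

The cost grows with the number of nonzero terms in the sparse operand rather than with r, which is why sparsity matters for QC-MDPC keys. A constant-time version would have to hide which exponents are present, and this sketch makes no attempt to do so.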

Citations: 1

Paralelização do Algoritmo de Indexação de Dados Multimídia Baseado em Quantização [Parallelization of the Quantization-Based Multimedia Data Indexing Algorithm]

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8699

André Fernandes, G. Teodoro

Citations: 0

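Otimização para Ambientes Intel(R) de um Metodo Numérico para o Escoamento Bifásico de Fluidos em Meios Porosos Através da Eliminação de Barreiras OpenMP [Optimization for Intel(R) Environments of a Numerical Method for Two-Phase Fluid Flow in Porous Media Through the Elimination of OpenMP Barriers]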

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8700

Weber Ribeiro, Thiago Teixeira, F. Cabral, M. R. Borges, C. Osthoff

Abstract: This paper presents optimizations of a numerical method for two-phase fluid flow in porous media, targeting parallel execution on Intel(R) environments. The tools of the Intel(R) Parallel Studio XE suite were used to study possible implementations. The EWS-SYNC implementation replaces OpenMP barriers with an explicit synchronization mechanism between threads; MPI is implemented for communication among distributed processors and to make the code usable in a cluster environment. The results for an increasing number of processes in the new MPI code were compared with those for an increasing number of threads in the EWS-SYNC code. The EWS-SYNC implementation obtained a speedup of 27x over serial execution, using Intel(R) Xeon Phi (KNL) hardware @ 1.40GHz with 68 physical cores and 4 threads/core, on a machine containing an Intel Xeon CPU E5-2698 v3 @ 2.30GHz with 32 physical cores, in [Teixeira et al. 2018].
Comparing the EWS-SYNC code with the serial code on the Intel(R) Xeon CPU E5-2698 v3 @ 2.30GHz architecture with 16 physical cores, the speedup was 10x; on the same architecture, the MPI code, still at an early stage of implementation, obtained a speedup of 23x relative to EWS-SYNC.
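The abstract does not show the EWS-SYNC mechanism itself, so the sketch below only illustrates the general idea of replacing a global barrier with explicit point-to-point synchronization: each worker waits for its immediate neighbours rather than for every thread. It is written in Python for brevity, not in the paper's OpenMP setting, and the toy stencil and the names `serial_stencil` and `neighbour_sync_stencil` are assumptions. Python threads show the ordering logic here, not a speedup.

```python
import threading


def serial_stencil(n: int, steps: int) -> list[int]:
    """Reference: each cell becomes the sum of itself and its neighbours."""
    cells = list(range(n))
    for _ in range(steps):
        cells = [sum(cells[max(i - 1, 0):i + 2]) for i in range(n)]
    return cells


def neighbour_sync_stencil(n: int, steps: int) -> list[int]:
    """Same computation, one thread per cell, with no global barrier."""
    history = [[i] for i in range(n)]  # history[i][s] = cell i after s steps
    cond = threading.Condition()

    def worker(i: int) -> None:
        neighbours = [j for j in (i - 1, i, i + 1) if 0 <= j < n]
        for s in range(steps):
            with cond:
                # Wait only until the adjacent cells have published step s.
                cond.wait_for(lambda: all(len(history[j]) > s for j in neighbours))
                inputs = [history[j][s] for j in neighbours]
            new_value = sum(inputs)  # compute outside the lock
            with cond:
                history[i].append(new_value)  # publish the value for step s + 1
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [h[steps] for h in history]


if __name__ == "__main__":
    print(neighbour_sync_stencil(4, 3) == serial_stencil(4, 3))
```

With a global barrier, the slowest thread delays every other thread at each step. With neighbour-only waiting, a delay spreads by at most one cell per step, which is the usual motivation for removing barriers from stencil-like codes.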

Citations: 0

Uma Abordagem em Ambiente Domiciliar Assistido Baseada no Paradigma de Segurança Orientada a Contexto [An Approach for Assisted Home Environments Based on the Context-Oriented Security Paradigm]

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/WSCAD_ESTENDIDO.2019.8693

Franco Umilio, E. C. Inacio, M. Dantas

Citations: 1

Utilizando a biblioteca PAPI para avaliar diferentes abordagens de construção de curvas b-spline [Using the PAPI Library to Evaluate Different Approaches to Constructing B-Spline Curves]

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8698

Vitor David, D. Araujo, Marcelo Zamith, Ubiratam de Paula

Citations: 0

Structural testing criteria for concurrent programs considering loop executions

Anais Estendidos do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD) Pub Date : 2019-11-12 DOI: 10.5753/wscad_estendido.2019.8711

Sílvia M. D. Diaz, P. S. Souza

Abstract: Parallel programs are essential for improving performance and solving larger problems, which increases the demand for efficient parallel programming techniques. This brings new challenges for software testing to ensure their quality and reliability. Structural testing is a technique that allows the identification of concurrency defects by analyzing the internal structure of the program. However, the non-determinism of concurrent programs has implications for the testing activity, requiring the use of structured methods to reveal defects. Testing criteria support the systematic selection of test cases by statically analyzing elements of concurrent programs. We found that there are currently gaps in the definition of testing criteria for scenarios with elements that are dynamically evaluated, such as the execution of communication primitives inside loops. The objective of this project is to define structural testing criteria to guide the selection of test cases, improving the reliability of concurrent programs by revealing non-determinism-related errors present in repetition structures. We developed a Concurrent Defects Taxonomy, identifying and classifying the types of concurrency defects found in related literature. The analysis of such defects, paths inside loops, number of loop iterations, and nested loops allows us to model the proposed structural testing criteria. We define new sets and associations related to communication and synchronization flows for message-passing programs, establishing a model for testing criteria.
We implemented the proposed test model in ValiMPI, a testing tool prototype, considering the new concepts defined in our test model, generating required elements and evaluating coverage after constructing loop paths. To evaluate the criteria, we performed an empirical study with statistical validation, reporting results for cost, effectiveness, and strength. Our experimental evaluation demonstrated that the proposed testing criteria generate required elements that support the identification of concurrency defects occurring in different loop iterations when communication events exhibit non-deterministic behavior.
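To give a sense of why communication primitives inside loops are hard to cover, the sketch below enumerates the orders in which a receiver could observe messages from two senders, each sending in a loop. It assumes only the standard message-passing rule that messages from a single sender arrive in the order sent. It does not reproduce the authors' criteria or ValiMPI, and the name `receive_orders` is hypothetical.

```python
from itertools import combinations


def receive_orders(sends_a: int, sends_b: int) -> list[str]:
    """List every order in which a wildcard receive could match the messages.

    Sender A sends `sends_a` messages and sender B sends `sends_b` messages.
    Per-sender order is preserved, so an order is fully determined by which
    receive slots are taken by A; the remaining slots belong to B.
    """
    total = sends_a + sends_b
    orders = []
    for a_slots in combinations(range(total), sends_a):
        taken_by_a = set(a_slots)
        orders.append("".join("A" if i in taken_by_a else "B" for i in range(total)))
    return orders


if __name__ == "__main__":
    for iterations in (1, 2, 3):
        print(iterations, "iterations per sender:",
              len(receive_orders(iterations, iterations)), "possible orders")
```

The number of possible orders is the binomial coefficient C(a + b, a), so it grows quickly with the loop iteration count. This makes it plausible that criteria which ignore iterations could miss defects that appear only in later iterations.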

Citations: 1