{"title":"Constant weight strings in constant time: a building block for code-based post-quantum cryptosystems","authors":"Alessandro Barenghi, Gerardo Pelosi","doi":"10.1145/3387902.3392630","DOIUrl":"https://doi.org/10.1145/3387902.3392630","url":null,"abstract":"Code-based cryptosystems often need to encode either a message or a random bitstring into one of fixed length and fixed (Hamming) weight. The lack of an efficient and reliable bijective map presents a problem in building constructions around these cryptosystems to attain security against active attackers. We present an efficiently computable, bijective function which yields the desired mapping. Furthermore, we delineate how this function can be computed in constant time. We experimentally validate the effectiveness and efficiency of our approach, comparing it against the current state-of-the-art solutions, achieving three to four orders of magnitude improvements in computation time, and validate its constant runtime.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126155641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SCoEmbeddings","authors":"Hui Huang, Yueyuan Jin, Ruonan Rao","doi":"10.1145/3387902.3394948","DOIUrl":"https://doi.org/10.1145/3387902.3394948","url":null,"abstract":"Contextualized word representations such as ELMo embeddings can capture rich semantic information and achieve impressive performance in a wide variety of NLP tasks. However, as with Word2Vec and GloVe, we found that ELMo word embeddings also lack sufficient sentiment information, which may affect sentiment classification performance. Inspired by a previous embedding refinement method based on a sentiment lexicon, we propose an approach that combines the contextualized embeddings (ELMo) of the pre-trained model with sentiment information from a lexicon to generate sentiment-contextualized embeddings, called SCoEmbeddings. Experimental results show that our SCoEmbeddings achieve higher accuracy than ELMo embeddings, Word2Vec embeddings, and refined Word2Vec embeddings on the SST-5 dataset. We also visualize the embeddings and weights of SCoEmbeddings, demonstrating their effectiveness.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121028834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direction-optimizing label propagation and its application to community detection","authors":"Xu T. Liu, M. Halappanavar, K. Barker, A. Lumsdaine, A. Gebremedhin","doi":"10.1145/3387902.3392634","DOIUrl":"https://doi.org/10.1145/3387902.3392634","url":null,"abstract":"Label Propagation, while more commonly known as a machine learning algorithm for classification, is also an effective method for detecting communities in networks. We propose a new Direction Optimizing Label Propagation Algorithm (DOLPA) that relies on the use of frontiers and alternates between label push and label pull operations to enhance the performance of the standard Label Propagation Algorithm (LPA). Specifically, DOLPA has parameters for tuning the processing order of vertices in a graph, which in turn reduces the number of edges visited and improves the quality of the solution obtained. We apply DOLPA to the community detection problem, present the design and implementation of the algorithm, and discuss its shared-memory parallelization using OpenMP. Empirically, we evaluate our algorithm using synthetic graphs as well as real-world networks. Compared with the state-of-the-art Parallel Label Propagation algorithm, we achieve at least two times the F-Score while reducing the runtime by 50% for synthetic graphs with overlapping communities. We also compare DOLPA against a state-of-the-art parallel implementation of the Louvain method using the same graphs and show that DOLPA achieves about three times the F-Score at 10% of the runtime.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"51 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130417880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software stack for an analog mesh computer: the case of a nanophotonic PDE accelerator","authors":"Engin Kayraklioglu, Jeff Anderson, H. Imani, V. Sorger, T. El-Ghazawi","doi":"10.1145/3387902.3394030","DOIUrl":"https://doi.org/10.1145/3387902.3394030","url":null,"abstract":"The slowing of Moore's Law is forcing the computer industry to embrace domain-specific hardware, which must be coupled with general-purpose traditional systems. This architecture is most useful when large compute power is needed. Among the most compute-intensive applications is the simulation of physical sciences. To maximize productivity in this domain, a variety of accelerators have been proposed; however, the analog mesh computer has consistently been proven to require the shortest time-to-solution when targeted toward the Poisson equation. Recent advances in material science have increased the flexibility of the analog mesh computer, positioning it well for future heterogeneous computing systems. However, for the analog mesh computer to gain widespread acceptance, a software stack is required to enable seamless integration with a classical computer. Here, we introduce a software stack designed for the class of analog mesh computers that efficiently generates mesh mappings of a physical problem by enabling users to describe their problem in terms of boundary conditions and mesh parameters. Experiments on a specific implementation of an analog mesh computer, the nanophotonic partial differential equation accelerator, show that this stack enables the problem-to-mesh scalability expected by the scientific community.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132814306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Management of container-based genetic algorithm workloads over cloud infrastructure","authors":"Thamer Alrefai, L. Indrusiak","doi":"10.1145/3387902.3394031","DOIUrl":"https://doi.org/10.1145/3387902.3394031","url":null,"abstract":"This paper proposes two approaches to managing the workload of multiple instances of genetic algorithms (GAs) running as containers over a cloud environment. The aim of both approaches is to obtain, for as many instances as possible, a GA output which achieves a user-defined fitness level by a user-defined deadline. To reach such a goal, the proposed approaches allocate the GA containers to cloud nodes and carefully control the execution of every GA instance by forcing them to run in stages. The paper proposes two approaches, fitness tracking (FT) and fitness prediction (FP), with both approaches compared against state-of-the-art container-based orchestration approaches.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114468165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RCecker","authors":"Xiaoxin Li, Jiazhen Li, Rui Hou, Dan Meng","doi":"10.1145/3387902.3392629","DOIUrl":"https://doi.org/10.1145/3387902.3392629","url":null,"abstract":"Return-oriented programming (ROP) is the major exploitation technique to hijack control flow in the presence of non-executable page protections. ROP can be prevented by ensuring that each ret targets a legal position. One method is to check whether the predecessor of the target of a ret is a call, in order to identify illegal uses of return. Performing this check at each ret with low performance overhead is challenging. To reduce the performance overhead, prior proposals check at critical API functions or system calls and rely on the OS to identify these events. The goal of this paper is to mitigate ROP attacks while incurring negligible storage and performance overheads, and without relying on OS support. This paper proposes a hardware mechanism, RCecker (Return-Call pair checker), to enforce backward CFI (control flow integrity). We propose RCecker-S, which checks at each ret once the target of the ret has been determined at the EX stage. We analyze the cause of the high performance overhead of RCecker-S. We further propose RCecker-R, which checks only when the RAS (Return Address Stack) mispredicts the target, to reduce the performance overhead. However, an attacker can use a Spectre-like attack to pollute the RAS and bypass the check of RCecker-R. We therefore propose RCecker-spec, which builds on RCecker-R and additionally checks at each speculative ret once the target of the ret has been predicted at the fetch stage. We implement RCecker on the RISC-V BOOM core and evaluate its security effectiveness and performance overhead. RCecker-spec successfully detects the ROP attacks in the RIPE benchmark. For the SPECINT CPU2006 benchmark, the average performance overhead is 0.69%.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123525146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient object detection framework with modified dense connections for small objects optimizations","authors":"Yicong Zhang, Mingyu Wang, Zhaolin Li","doi":"10.1145/3387902.3392620","DOIUrl":"https://doi.org/10.1145/3387902.3392620","url":null,"abstract":"Object detection frameworks for small objects are increasingly in demand in specific fields such as high-speed object tracking and remote sensing image recognition. In this paper, we propose an efficient object detection framework with modified dense connections for small objects. In order to improve both the detection accuracy and speed for small objects, the proposed framework constructs a convolutional neural network by using modified dense and residual cross-layer connections between multi-scale convolutional layers to extract deep features effectively. Based on the modified dense structure, a hybrid-scale feature fusion method is proposed to concatenate the multi-channel high-dimensional features and perform cross-entropy calculation and regression prediction. With this method, the framework not only significantly improves the detection accuracy for small objects, but also improves the overall detection accuracy and optimizes the network parameters to greatly reduce the detection time. The experimental results show that the proposed framework achieves 90.6% mAP for small objects on a public ship dataset, which is 25.2% more than SSD-VGGNet. Owing to its detection efficiency for small objects, it improves the overall detection accuracy and detection speed by 9% and 40% respectively, while the network parameters are reduced by about 70%.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"30 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120992499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application-specific network-on-chip design space exploration framework for neuromorphic processor","authors":"Ziyang Kang, Shiying Wang, Lei Wang, Shiming Li, Lianhua Qu, Wei Shi, Rui Gong, Weixia Xu","doi":"10.1145/3387902.3392626","DOIUrl":"https://doi.org/10.1145/3387902.3392626","url":null,"abstract":"Neuromorphic processors can support the design of various Spiking Neural Networks (SNNs) to deal with different tasks, such as recognition and tracking. Neuromorphic processors use a Network-on-Chip (NoC) to support communication between neurons in an SNN. Different SNNs have different communication traffic patterns, which pose different challenges for NoC design. A reasonable NoC architecture can improve the overall performance of the processor, for example by lowering latency. Hence, it is critical to explore the NoC architecture design space for neuromorphic processors. This paper proposes a rapid NoC design space exploration (DSE) framework. To our knowledge, it is the first work on NoC DSE for neuromorphic processors. The framework takes the spikes of the SNN application as input and supports multiple optimization objectives for NoC design. An optimized simulated annealing algorithm performs the DSE over the NoC design space, and the framework then outputs the final NoC design configuration. We apply this framework to 7 SNN applications to perform the NoC DSE. Compared with the baseline NoC configuration, the NoC DSE framework improves performance (average transport latency) by 54% to 93%. Compared with the Simulated Annealing (SA) algorithm, the Better-History SA (BHSA) algorithm speeds up the search process by 1.5 to 8 times.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127234060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comprehensive analysis of constant-time polynomial inversion for post-quantum cryptosystems","authors":"Alessandro Barenghi, Gerardo Pelosi","doi":"10.1145/3387902.3397224","DOIUrl":"https://doi.org/10.1145/3387902.3397224","url":null,"abstract":"Post-quantum cryptosystems have recently seen a surge in interest thanks to the current standardization initiative by the U.S.A. National Institute of Standards and Technology (NIST). A common primitive in post-quantum cryptosystems, in particular in code-based ones, is the computation of the inverse of a binary polynomial in a binary polynomial ring. In this work, we analyze, realize in software, and benchmark a broad spectrum of binary polynomial inversion algorithms, targeting operand sizes which are relevant for the current second round candidates in the NIST standardization process. We evaluate advantages and shortcomings of the different inversion algorithms, including their capability to run in constant time, thus preventing timing side-channel attacks.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"15 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132090451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scale-out beam longitudinal dynamics simulations","authors":"K. Iliakis, H. Timko, S. Xydis, D. Soudris","doi":"10.1145/3387902.3392616","DOIUrl":"https://doi.org/10.1145/3387902.3392616","url":null,"abstract":"Extensive studies and simulations are required to plan for the upcoming upgrades of the world's largest particle accelerators, and the design of future machines, given the technological challenges and tight budgetary constraints. The Beam Longitudinal Dynamics (BLonD) simulator suite incorporates the most detailed and complex physics phenomena in the field of longitudinal beam dynamics, required for providing extremely accurate predictions. These predictions are invaluable to the operation of existing accelerators, upcoming upgrades, and future studies. To undertake this agenda, and enable, for the first time, scale-out beam longitudinal dynamics simulations, we implement Hybrid-BLonD, a distributed version of BLonD that efficiently combines horizontal and vertical scaling. We propose a series of techniques that minimize the inter-node communication overhead and improve scalability. Firstly, we exploit mixed data and task parallelism opportunities. Secondly, we discuss two traffic optimisation techniques motivated by the properties of the simulated physics phenomena. Finally, we build a dynamic load-balancing scheme that effectively coordinates all the above features. We experimentally evaluate Hybrid-BLonD in an HPC cluster built with cutting-edge Intel servers and an InfiniBand interconnection network. Our fully-optimised implementation demonstrates an average 25.7X speedup over the previous state-of-the-art simulator when run on 32 computing nodes, across three real-world test cases.","PeriodicalId":155089,"journal":{"name":"Proceedings of the 17th ACM International Conference on Computing Frontiers","volume":"230 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134542904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}