Philipp Koppermann, F. D. Santis, Johann Heyszl, G. Sigl
{"title":"Automatic generation of high-performance modular multipliers for arbitrary mersenne primes on FPGAs","authors":"Philipp Koppermann, F. D. Santis, Johann Heyszl, G. Sigl","doi":"10.1109/HST.2017.7951794","DOIUrl":"https://doi.org/10.1109/HST.2017.7951794","url":null,"abstract":"Modular multiplication is a fundamental and performance determining operation in various public-key cryptosystems. High-performance modular multipliers on FPGAs are commonly realized by several small-sized multipliers, an adder tree for summing up the digit-products, and a reduction circuit. While small-sized multipliers are available in pre-fabricated high-speed DSP slices, the adder tree and the reduction circuit are implemented in standard logic. The latter operations represent the performance bottleneck to high-performance implementations. Previous works attempted to minimize the critical path of the adder tree by rearranging digit-products on digit-level. We report improved performance by regrouping digit-products on bit-level, while incorporating the reduction for Mersenne primes. Our approach leads to very fast modular multipliers, whose latency and throughput characteristics outperform all previous results. 
We formalize our approach and provide algorithms to automatically generate high-performance modular multipliers for arbitrary Mersenne primes from any small-sized multipliers.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121159842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Circuit recognition with deep learning","authors":"Yu-Yun Dai, R. Brayton","doi":"10.1109/HST.2017.7951826","DOIUrl":"https://doi.org/10.1109/HST.2017.7951826","url":null,"abstract":"Identifying properties (features) of circuits and applying proper algorithms are helpful for solving various computer-aided design problems. For hardware security inspection, there is a demand for reverse engineering, the process of extracting high-level components from bit-level designs. Given a suspected circuit block, a common approach is to find a set of candidate functions and then to apply formal methods to identify it. Identifying useful features of high-level functions and collecting suggested candidates of an unknown block are important steps. Convolutional neural networks (CNNs) have been used extensively in machine learning because often pre-defined features are not required. Deep networks with multiple processing layers have been shown to be capable of learning concealed structures of objects during a training process. This paper discusses requirements for representing logic circuits for CNN processing. A new circuit representation (data format) is developed for the proposed circuit-based convolution operation with dynamic pooling. Based on this data format, a deep learning framework using CNNs to recognize circuit functionalities was built. Compared to reference methods based on support vector machines (SVM), experiments demonstrate the effectiveness of the proposed CNN method for both circuit classification as well as function detection and location. With proper training data, e.g. 
a set of circuits with hidden Trojans, the proposed framework can be used to train a model to help detect and locate malware in hardware designs.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122550435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient configurations for block ciphers with unified ENC/DEC paths","authors":"S. Banik, A. Bogdanov, F. Regazzoni","doi":"10.1109/HST.2017.7951795","DOIUrl":"https://doi.org/10.1109/HST.2017.7951795","url":null,"abstract":"Block Ciphers providing the combined functionalities of encryption and decryption are required to operate in modes of operation like CBC and ELmD. Hence such architectures form critical building blocks for secure cryptographic implementations. Depending on the algebraic structure of a given cipher, there may be multiple ways of constructing the combined encryption/decryption circuit, each targeted at optimizing lightweight design metrics like area or power etc. In this paper we look at how the choice of circuit configuration affects the energy required to perform one encryption/decryption. We begin by analyzing 12 circuit configurations for the Advanced Encryption Standard (AES-128) cipher and establish some design rules for energy efficiency. We then extend our analysis to several lightweight block ciphers. In the second part of the paper we also investigate area optimized circuits for combined implementations of these ciphers.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130506163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On designing optimal camouflaged layouts","authors":"T. Broadfoot, C. Sechen, J. Rajendran","doi":"10.1109/HST.2017.7951833","DOIUrl":"https://doi.org/10.1109/HST.2017.7951833","url":null,"abstract":"Integrated circuit (IC) camouflaging is a layout-level technique that hampers reverse-engineering attacks. In one embodiment of camouflaging, layouts of different Boolean gates are designed to look alike by using a combination of true and dummy contacts. The security of IC camouflaging using dummy contacts depends on an attacker's inability to determine whether a contact is true or dummy. The layouts of camouflaged gates in prior works incur tremendous overhead: 4x in area, 5.5x in power, and 1.8x in delay. Thus, overhead is a major impediment to adoption of IC camouflaging. To solve this problem, we propose an algorithm to generate the layouts of camouflaged gates using dummy contacts for static CMOS logic. In this work, we develop low-cost camouflaging solutions using a graph-theoretic approach. Given an optimization objective and a list of Boolean functions whose layouts have to look alike, the proposed algorithm produces look-alike layouts that are optimized (minimized) in terms of area, power, delay, or a combination. 
Results indicate that the proposed design is more robust against variations and has better noise margin than the existing techniques.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"49 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120920882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
André Schaller, Wenjie Xiong, N. Anagnostopoulos, Muhammad Umair Saleem, Sebastian Gabmeyer, S. Katzenbeisser, Jakub Szefer
{"title":"Intrinsic Rowhammer PUFs: Leveraging the Rowhammer effect for improved security","authors":"André Schaller, Wenjie Xiong, N. Anagnostopoulos, Muhammad Umair Saleem, Sebastian Gabmeyer, S. Katzenbeisser, Jakub Szefer","doi":"10.1109/HST.2017.7951729","DOIUrl":"https://doi.org/10.1109/HST.2017.7951729","url":null,"abstract":"Physically Unclonable Functions (PUFs) have become an important and promising hardware primitive for device fingerprinting, device identification, or key storage. Intrinsic PUFs leverage components already found in existing devices, unlike extrinsic silicon PUFs, which are based on customized circuits that involve modification of hardware. In this work, we present a new type of a memory-based intrinsic PUF, which leverages the Rowhammer effect in DRAM modules — the Rowhammer PUF. Our PUF makes use of bit flips, which occur in DRAM cells due to rapid and repeated access of DRAM rows. Prior research has mainly focused on Rowhammer attacks, where the Rowhammer effect is used to illegitimately alter data stored in memory, e.g., to change page table entries or enable privilege escalation attacks. Meanwhile, this is the first work to use the Rowhammer effect in a positive context — to design a novel PUF. We extensively evaluate the Rowhammer PUF using commercial, off-the-shelf devices, not relying on custom hardware or an FPGA-based setup. 
The evaluation shows that the Rowhammer PUF holds required properties needed for the envisioned security applications, and could be deployed today.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129236343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Das, Shovan Maity, Saad Bin Nasir, Santosh K. Ghosh, A. Raychowdhury, Shreyas Sen
{"title":"High efficiency power side-channel attack immunity using noise injection in attenuated signature domain","authors":"D. Das, Shovan Maity, Saad Bin Nasir, Santosh K. Ghosh, A. Raychowdhury, Shreyas Sen","doi":"10.1109/HST.2017.7951799","DOIUrl":"https://doi.org/10.1109/HST.2017.7951799","url":null,"abstract":"With the advancement of technology in the last few decades, leading to the widespread availability of miniaturized sensors and internet-connected things (IoT), security of electronic devices has become a top priority. Side-channel attack (SCA) is one of the prominent methods to break the security of an encryption system by exploiting the information leaked from the physical devices. Correlational power attack (CPA) is an efficient power side-channel attack technique, which analyses the correlation between the estimated and measured supply current traces to extract the secret key. The existing countermeasures to the power attacks are mainly based on reducing the SNR of the leaked data, or introducing large overhead using techniques like power balancing. This paper presents an attenuated signature AES (AS-AES), which resists SCA with minimal noise current overhead. AS-AES uses a shunt low-drop-out (LDO) regulator to suppress the AES current signature by 400x in the supply current traces. The shunt LDO has been fabricated and validated in 130 nm CMOS technology. 
System-level implementation of the AS-AES along with noise injection, shows that the system remains secure even after 50K encryptions, with 10x reduction in power overhead compared to that of noise addition alone.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124170217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Øzone: Efficient execution with zero timing leakage for modern microarchitectures","authors":"Zelalem Birhanu Aweke, T. Austin","doi":"10.1109/HST.2017.7951817","DOIUrl":"https://doi.org/10.1109/HST.2017.7951817","url":null,"abstract":"Time variation during program execution can leak sensitive information. Time variations due to program control flow and hardware resource contention have been used to steal encryption keys in cipher implementations such as AES and RSA. A number of approaches to mitigate timing-based side-channel attacks have been proposed including cache partitioning, control-flow obfuscation and injecting timing noise into the outputs of code. While these techniques make timing-based side-channel attacks more difficult, they do not eliminate the risks. Prior techniques are either too specific or too expensive, and all leave remnants of the original timing side channel for later attackers to attempt to exploit. In this work, we show that the state-of-the-art techniques in timing side-channel protection, which limit timing leakage but do not eliminate it, still have significant vulnerabilities to timing-based side-channel attacks. To provide a means for total protection from timing-based side-channel attacks, we develop Ozone, the first zero timing leakage execution resource for a modern microarchitecture. Code in Ozone executes under a special hardware thread that gains exclusive access to a single core's resources for a fixed (and limited) number of cycles during which it cannot be interrupted. Memory access under Ozone thread execution is limited to a fixed size uncached scratchpad memory, and all Ozone threads begin execution with a known fixed microarchitectural state. 
We evaluate Ozone using a number of security sensitive kernels that have previously been targets of timing side-channel attacks, and show that Ozone eliminates timing leakage with minimal performance overhead.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"141 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130924305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. M. John, Syed Kamran Haider, H. Omar, Marten van Dijk
{"title":"Connecting the dots: Privacy leakage via write-access patterns to the main memory","authors":"T. M. John, Syed Kamran Haider, H. Omar, Marten van Dijk","doi":"10.1109/HST.2017.7951834","DOIUrl":"https://doi.org/10.1109/HST.2017.7951834","url":null,"abstract":"Data-dependent access patterns of an application to an untrusted storage system are notorious for leaking sensitive information about the user's data. Previous research has shown how an adversary capable of monitoring both read and write requests issued to the memory can correlate them with the application to learn its sensitive data. However, information leakage through only the write access patterns is less obvious and not well studied in the current literature. In this work, we demonstrate an actual attack on the power-side-channel resistant Montgomery's ladder based modular exponentiation algorithm commonly used in public key cryptography. We infer the complete 512-bit secret exponent in ∼3.5 minutes by virtue of just the write access patterns of the algorithm to the main memory. In order to learn the victim algorithm's write access patterns under realistic settings, we exploit a compromised DMA device to take frequent snapshots of the application's address space, and then run a simple differential analysis on these snapshots to find the write access sequence. The attack has been shown on an Intel Core(TM) i7-4790 3.60GHz processor based system. Although our exploitation strategy to infer the write access patterns has certain limitations, it conveys the underlying message that even if only the write access sequence is given, the application's sensitive information can be learned. 
We also discuss some techniques to overcome these limitations, and also some countermeasures to prevent such attacks.","PeriodicalId":190635,"journal":{"name":"2017 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116030860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}