{"title":"Position Paper: Consider Hardware-enhanced Defenses for Rootkit Attacks","authors":"Guangyuan Hu, Tianwei Zhang, Ruby B. Lee","doi":"10.1145/3458903.3458909","DOIUrl":"https://doi.org/10.1145/3458903.3458909","url":null,"abstract":"Rootkits are malware that attempt to compromise the system’s functionalities while hiding their existence. Various rootkits have been proposed as well as different software defenses, but only very few hardware defenses. We position hardware-enhanced rootkit defenses as an interesting research opportunity for computer architects, especially as many new hardware defenses for speculative execution attacks are being actively considered. We first describe different techniques used by rootkits and their prime targets in the operating system. We then try to shed insights on what the main challenges are in providing a rootkit defense, and how these may be overcome. We show how a hypervisor-based defense can be implemented, and provide a full prototype implementation in an open-source cloud computing platform, OpenStack. We evaluate the performance overhead of different defense mechanisms. Finally, we point to some research opportunities for enhancing resilience to rootkit-like attacks in the hardware architecture.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116844879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA Bitstream Modification with Interconnect in Mind","authors":"M. Moraitis, E. Dubrova","doi":"10.1145/3458903.3458908","DOIUrl":"https://doi.org/10.1145/3458903.3458908","url":null,"abstract":"Bitstream reverse engineering is traditionally associated with Intellectual Property (IP) theft. Another, less known, threat deriving from that is bitstream modification attacks. It has been shown that the secret key can be extracted from FPGA implementations of cryptographic algorithms by injecting faults directly into the bitstream. Such bitstream modification attacks rely on changing the content of Look Up Tables (LUTs). Therefore, related countermeasures aim to make the task of identifying a LUT more difficult (e.g. by masking LUT content). However, recent advances in FPGA reverse engineering revealed information on how interconnects are encoded in the bitstream of Xilinx 7 series FPGAs. In this paper, we show that this knowledge can be used to break or weaken existing countermeasures, as well as improve existing attacks. Furthermore, a straightforward attack that re-routes the key to an output pin becomes possible. We demonstrate our claims on an FPGA implementation of SNOW 3G stream cipher, a core algorithm for confidentiality and integrity used in several 3GPP wireless communication standards, including the new Next Generation 5G.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131651016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis and Hardware Optimization of Lattice Post-Quantum Cryptography Workloads","authors":"Sandhya Koteshwara, M. Kumar, P. Pattnaik","doi":"10.1145/3458903.3458905","DOIUrl":"https://doi.org/10.1145/3458903.3458905","url":null,"abstract":"The mathematical constructs, nature of computations and challenges in optimizing lattice post-quantum cryptographic algorithms on modern many-core processors are discussed in this paper. Identification of time-consuming functions and subsequent hardware optimization using vector units and hardware accelerators of one of the candidates, CRYSTALS-Kyber, leads to performance improvement of around 52% for its SHA3 variant and 83% for its AES variant. Detailed Cycles-per-Instruction (CPI) stack breakdown before and after optimization indicates a CPI of around 0.5 and dominance of load/store operations in these workloads.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115019631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementing the Draft RISC-V Scalar Cryptography Extensions","authors":"Ben Marshall, D. Page, T. Pham","doi":"10.1145/3458903.3458904","DOIUrl":"https://doi.org/10.1145/3458903.3458904","url":null,"abstract":"RISC-V is an increasingly popular, free and open Instruction Set Architecture (ISA). Many standard extensions to RISC-V are currently being designed and evaluated, including one for accelerating cryptographic workloads. Unlike most incumbent ISAs which re-use existing large SIMD state and data-paths to accelerate cryptographic operations, RISC-V also adds support for smaller machines with narrow 32 and 64-bit data-paths. For embedded, IoT class devices, this significantly lowers the barrier to entry for secure and efficient accelerated cryptography. In this paper, we describe (to our knowledge) the first complete, free and open-source implementation of the draft 32-bit RISC-V Cryptography Extension. We detail the performance benefits for several important algorithms, and associated hardware costs. Our experiences help to guide the ongoing standardisation work and provide a platform for other researchers to experiment with a complete and representative CPU system, implementing the draft cryptography extension.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127114886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uncovering Hidden Instructions in Armv8-A Implementations","authors":"Fredrik Strupe, Rakesh Kumar","doi":"10.1145/3458903.3458906","DOIUrl":"https://doi.org/10.1145/3458903.3458906","url":null,"abstract":"Though system and application level security has received and continue to receive significant attention, interest in hardware security has spiked only in the last few years. The majority of recently disclosed hardware security attacks exploit well known and documented hardware behaviours such as speculation, cache and memory timings, etc. We observe that security exploits in undocumented hardware behaviour can have even more severe consequences as such behaviour is rarely verified and protected against. This paper introduces armshaker, a tool to uncover one such undocumented behaviour in the Armv8 architecture, namely hidden instructions. These are the instructions that are not documented in the ISA reference manual, but still execute successfully. We tested five different Armv8-A hardware platforms from four different vendors, as well as two Armv8-A emulators, and uncovered multiple hidden instructions. An interesting finding is that, though we did not discover any hidden instruction in the hardware itself, bugs in the system software can induce hidden instructions in the system that, from a user’s perspective, are indistinguishable from hidden instructions in hardware. Though armshaker did not find any hidden instruction in the hardware of the tested platforms, their existence cannot be ruled out, given the diversity of available Arm processors. Consequently, we make armshaker publicly available as open-source software to enable users to audit their own systems for hidden instructions.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122134096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deeksha Dangwal, M. Cowan, Armin Alaghi, Vincent T. Lee, Brandon Reagen, Caroline Trippel
{"title":"SoK: Opportunities for Software-Hardware-Security Codesign for Next Generation Secure Computing","authors":"Deeksha Dangwal, M. Cowan, Armin Alaghi, Vincent T. Lee, Brandon Reagen, Caroline Trippel","doi":"10.1145/3458903.345891","DOIUrl":"https://doi.org/10.1145/3458903.345891","url":null,"abstract":"Users are demanding increased data security. As a result, security is rapidly becoming a first-order design constraint in next generation computing systems. Researchers and practitioners are exploring various security technologies to meet user demand such as trusted execution environments (e.g., Intel SGX, ARM TrustZone), homomorphic encryption, and differential privacy. Each technique provides some degree of security, but differs with respect to threat coverage, performance overheads, as well as implementation and deployment challenges. In this paper, we present a systemization of knowledge (SoK) on these design considerations and trade-offs using several prominent security technologies. Our study exposes the need for software-hardware-security codesign to realize efficient and effective solutions of securing user data. In particular, we explore how design considerations across applications, hardware, and security mechanisms must be combined to overcome fundamental limitations in current technologies so that we can minimize performance overhead while achieving sufficient threat model coverage. Finally, we propose a set of guidelines to facilitate putting these secure computing technologies into practice.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128303089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. T. Markettos, John Baldwin, Ruslan Bukin, P. Neumann, S. Moore, R. Watson
{"title":"Position Paper:Defending Direct Memory Access with CHERI Capabilities","authors":"A. T. Markettos, John Baldwin, Ruslan Bukin, P. Neumann, S. Moore, R. Watson","doi":"10.1145/3458903.3458910","DOIUrl":"https://doi.org/10.1145/3458903.3458910","url":null,"abstract":"We propose new solutions that can efficiently address the problem of malicious memory access from pluggable computer peripherals and microcontrollers embedded within a system-on-chip. This problem represents a serious emerging threat to total-system computer security. Previous work has shown that existing defenses are insufficient and poorly deployed, in part due to performance concerns. In this paper we explore the threat and its implications for system architecture. We propose a range of protection techniques, from lightweight to heavyweight, across different classes of systems. We consider how emerging capability architectures (and specifically the CHERI protection model) can enhance protection and provide a convenient bridge to describe interactions among software and hardware components. Finally, we describe how new schemes may be more efficient than existing defenses.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"07 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127260864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SIMD Instruction Set Extensions for Keccak with Applications to SHA-3, Keyak and Ketje","authors":"Hemendra K. Rawat, P. Schaumont","doi":"10.1145/2948618.2948622","DOIUrl":"https://doi.org/10.1145/2948618.2948622","url":null,"abstract":"Recent processor architectures such as Intel Westmere (and later) and ARMv8 include instruction-level support for the Advanced Encryption Standard (AES), for the Secure Hashing Standard (SHA-1, SHA2) and for carry-less multiplication. These crypto-instruction sets provide specialized hardware processing at the top of the memory hierarchy, and provide significant performance improvements over general-purpose software for common cryptographic operations. We propose a crypto-instruction set for the Keccak cryptographic sponge and for the Keccak duplex construction. Our design is integrated on a 128 bit SIMD interface, applicable to the ARM NEON and Intel AVX (128 bit) architecture. The proposed instruction set is optimized for flexibility and supports multiple variants of the Keccak-f[b] permutation, for b equal to 200, 400, 800, or 1600 bit. We investigate the performance of the design using the GEM5 micro-architecture simulator. Compared to the latest hand-optimized results, we demonstrate a performance improvement of 2 times (over NEON programming) to 6 times (over Assembly programming). For example, an optimized NEON implementation of SHA3-512 computes a hash at 48.1 instructions per byte, while our design uses 21.9 instructions per byte. The NEON optimized version of the Lake Keyak AEAD uses 13.4 instructions per byte, while our design uses 7.7 instructions per byte. We provide comprehensive performance evaluation for multiple configurations of the Keccak-f[b] permutation in multiple applications (Hash, Encryption, AEAD). We also analyze the hardware cost of the proposed instructions in gate-equivalent of 90nm standard cells, and show that the proposed instructions only require 4658 GE, a fraction of the cost of a full ARM Cortex-A9.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134357469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bin Cedric Xing, Mark Shanahan, Rebekah Leslie-Hurd
{"title":"Intel® Software Guard Extensions (Intel® SGX) Software Support for Dynamic Memory Allocation inside an Enclave","authors":"Bin Cedric Xing, Mark Shanahan, Rebekah Leslie-Hurd","doi":"10.1145/2948618.2954330","DOIUrl":"https://doi.org/10.1145/2948618.2954330","url":null,"abstract":"Intel® Software Guard Extensions (Intel® SGX) SGX2 extends the Intel® Software Guard Extensions (SGX) instruction set and enables software developers to dynamically manage memory within the SGX environment. This paper reviews the current SGX Software RunTime Environment and proposes additions to the framework which will allow developers to benefit from features enabled by SGX2 such as dynamic heap management, stack expansion, and thread context creation.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127849413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonid Azriel, R. Ginosar, S. Gueron, A. Mendelson
{"title":"Using Scan Side Channel for Detecting IP Theft","authors":"Leonid Azriel, R. Ginosar, S. Gueron, A. Mendelson","doi":"10.1145/2948618.2948619","DOIUrl":"https://doi.org/10.1145/2948618.2948619","url":null,"abstract":"We present a process for detection of IP theft in VLSI devices that exploits the internal test scan chains. The IP owner learns implementation details in the suspect device to find evidence of the theft, while the top level function is public. The scan chains supply direct access to the internal registers in the device, thus making it possible to learn the logic functions of the internal combinational logic chunks. Our work introduces an innovative way of applying Boolean function analysis techniques for learning digital circuits with the goal of IP theft detection. By using Boolean function learning methods, the learner creates a partial dependency graph of the internal flip-flops. The graph is further partitioned using the SNN graph clustering method, and individual blocks of combinational logic are isolated. These blocks can be matched with known building blocks that compose the original function. This enables reconstruction of the function implementation to the level of pipeline structure. The IP owner can compare the resulting structure with his own implementation to confirm or refute that an IP violation has occurred. We demonstrate the power of the presented approach with a test case of an open source Bitcoin SHA-256 accelerator, containing more than 80,000 registers. With the presented method we discover the microarchitecture of the module, locate all the main components of the SHA-256 algorithm, and learn the module's flow control.","PeriodicalId":141766,"journal":{"name":"Hardware and Architectural Support for Security and Privacy","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133754082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}