Juelin Zhang, Taoyun Wang, Yiteng Sun, Fanjie Ji, Bohan Wang, Lu Li, Yu Yu, Weijia Wang
{"title":"基于表的高效屏蔽与预处理","authors":"Juelin Zhang, Taoyun Wang, Yiteng Sun, Fanjie Ji, Bohan Wang, Lu Li, Yu Yu, Weijia Wang","doi":"10.46586/tches.v2024.i3.273-301","DOIUrl":null,"url":null,"abstract":"Masking is one of the most investigated countermeasures against sidechannel attacks. In a nutshell, it randomly encodes each sensitive variable into a number of shares, and compiles the cryptographic implementation into a masked one that operates over the shares instead of the original sensitive variables. Despite its provable security benefits, masking inevitably introduces additional overhead. Particularly, the software implementation of masking largely slows down the cryptographic implementations and requires a large number of random bits that need to be produced by a true random number generator. In this respect, reducing the< overhead of masking is still an essential and challenging task. Among various known schemes, Table-Based Masking (TBM) stands out as a promising line of work enjoying the advantages of generality to any lookup tables. It also allows the pre-processing paradigm, wherein a pre-processing phase is executed independently of the inputs, and a much more efficient online (using the precomputed tables) phase takes place to calculate the result. Obviously, practicality of pre-processing paradigm relies heavily on the efficiency of online phase and the size of precomputed tables.In this paper, we investigate the TBM scheme that offers a combination of linear complexity (in terms of the security order, denoted as d) during the online phase and small precomputed tables. We then apply our new scheme to the AES-128, and provide an implementation on the ARM Cortex architecture. Particularly, for a security order d = 8, the online phase outperforms the current state-of-the-art AES implementations on embedded processors that are vulnerable to the side-channel attacks. The security order of our scheme is proven in theory and verified by the T-test in practice. Moreover, we investigate the speed overhead associated with the random bit generation in our masking technique. Our findings indicate that the speed overhead can be effectively balanced. This is mainly because that the true random number generator operates in parallel with the processor’s execution, ensuring a constant supply of fresh random bits for the masked computation at regular intervals.","PeriodicalId":321490,"journal":{"name":"IACR Transactions on Cryptographic Hardware and Embedded Systems","volume":" 31","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient Table-Based Masking with Pre-processing\",\"authors\":\"Juelin Zhang, Taoyun Wang, Yiteng Sun, Fanjie Ji, Bohan Wang, Lu Li, Yu Yu, Weijia Wang\",\"doi\":\"10.46586/tches.v2024.i3.273-301\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Masking is one of the most investigated countermeasures against sidechannel attacks. In a nutshell, it randomly encodes each sensitive variable into a number of shares, and compiles the cryptographic implementation into a masked one that operates over the shares instead of the original sensitive variables. Despite its provable security benefits, masking inevitably introduces additional overhead. Particularly, the software implementation of masking largely slows down the cryptographic implementations and requires a large number of random bits that need to be produced by a true random number generator. In this respect, reducing the< overhead of masking is still an essential and challenging task. Among various known schemes, Table-Based Masking (TBM) stands out as a promising line of work enjoying the advantages of generality to any lookup tables. It also allows the pre-processing paradigm, wherein a pre-processing phase is executed independently of the inputs, and a much more efficient online (using the precomputed tables) phase takes place to calculate the result. Obviously, practicality of pre-processing paradigm relies heavily on the efficiency of online phase and the size of precomputed tables.In this paper, we investigate the TBM scheme that offers a combination of linear complexity (in terms of the security order, denoted as d) during the online phase and small precomputed tables. We then apply our new scheme to the AES-128, and provide an implementation on the ARM Cortex architecture. Particularly, for a security order d = 8, the online phase outperforms the current state-of-the-art AES implementations on embedded processors that are vulnerable to the side-channel attacks. The security order of our scheme is proven in theory and verified by the T-test in practice. Moreover, we investigate the speed overhead associated with the random bit generation in our masking technique. Our findings indicate that the speed overhead can be effectively balanced. This is mainly because that the true random number generator operates in parallel with the processor’s execution, ensuring a constant supply of fresh random bits for the masked computation at regular intervals.\",\"PeriodicalId\":321490,\"journal\":{\"name\":\"IACR Transactions on Cryptographic Hardware and Embedded Systems\",\"volume\":\" 31\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IACR Transactions on Cryptographic Hardware and Embedded Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.46586/tches.v2024.i3.273-301\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IACR Transactions on Cryptographic Hardware and Embedded Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.46586/tches.v2024.i3.273-301","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
屏蔽是针对侧信道攻击研究最多的对策之一。简而言之,它将每个敏感变量随机编码成若干个份额,然后将加密实现编译成一个掩码实现,对这些份额而不是原始敏感变量进行操作。尽管掩码具有可证明的安全优势,但不可避免地会带来额外的开销。特别是,掩码的软件实现在很大程度上减慢了加密实现的速度,并且需要大量的随机比特,而这些随机比特需要由真正的随机数发生器产生。因此,减少掩码开销仍然是一项重要而具有挑战性的任务。在各种已知方案中,基于表的掩码(TBM)是一种很有前途的方案,它具有适用于任何查找表的优点。它还允许预处理范例,其中预处理阶段独立于输入执行,而更高效的在线(使用预计算表)阶段则用于计算结果。显然,预处理范式的实用性在很大程度上取决于在线阶段的效率和预计算表的大小。在本文中,我们研究了 TBM 方案,该方案将在线阶段的线性复杂性(以安全阶表示,用 d 表示)和较小的预计算表结合在一起。然后,我们将新方案应用于 AES-128,并在 ARM Cortex 架构上进行了实现。特别是在安全阶数 d = 8 的情况下,在线阶段的性能优于目前嵌入式处理器上最先进的 AES 实现,因为嵌入式处理器容易受到侧信道攻击。我们方案的安全阶在理论上得到了证明,并在实践中通过 T 测试进行了验证。此外,我们还研究了我们的掩码技术中与随机比特生成相关的速度开销。我们的研究结果表明,速度开销可以得到有效平衡。这主要是因为真正的随机数生成器是与处理器的执行并行运行的,从而确保了每隔一定时间就能为屏蔽计算持续提供新的随机比特。
Masking is one of the most investigated countermeasures against sidechannel attacks. In a nutshell, it randomly encodes each sensitive variable into a number of shares, and compiles the cryptographic implementation into a masked one that operates over the shares instead of the original sensitive variables. Despite its provable security benefits, masking inevitably introduces additional overhead. Particularly, the software implementation of masking largely slows down the cryptographic implementations and requires a large number of random bits that need to be produced by a true random number generator. In this respect, reducing the< overhead of masking is still an essential and challenging task. Among various known schemes, Table-Based Masking (TBM) stands out as a promising line of work enjoying the advantages of generality to any lookup tables. It also allows the pre-processing paradigm, wherein a pre-processing phase is executed independently of the inputs, and a much more efficient online (using the precomputed tables) phase takes place to calculate the result. Obviously, practicality of pre-processing paradigm relies heavily on the efficiency of online phase and the size of precomputed tables.In this paper, we investigate the TBM scheme that offers a combination of linear complexity (in terms of the security order, denoted as d) during the online phase and small precomputed tables. We then apply our new scheme to the AES-128, and provide an implementation on the ARM Cortex architecture. Particularly, for a security order d = 8, the online phase outperforms the current state-of-the-art AES implementations on embedded processors that are vulnerable to the side-channel attacks. The security order of our scheme is proven in theory and verified by the T-test in practice. Moreover, we investigate the speed overhead associated with the random bit generation in our masking technique. Our findings indicate that the speed overhead can be effectively balanced. This is mainly because that the true random number generator operates in parallel with the processor’s execution, ensuring a constant supply of fresh random bits for the masked computation at regular intervals.