Secure and Efficient Masking of Lightweight Ciphers in Software and Hardware

Xuefeng Zhao
Computer Journal, published 2023-03-01
DOI: 10.1093/comjnl/bxad002 (https://doi.org/10.1093/comjnl/bxad002)
Impact factor 1.5; Zone 4, Computer Science; JCR Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Citations: 0

Abstract

Masking is a widely used and deployed countermeasure against side-channel attacks in both software and hardware. Because masking comes at a great cost, research has focused on lowering the performance penalty and finding efficient masked implementations. Our contribution is twofold. For software masking, we first find bitsliced implementations of the Sboxes with multiplicative complexity 4 and multiplicative depth 2, then adapt the common shares approach introduced by Coron et al. at CHES 2016 so that many cross-products $a_{i}\cdot b_{j}$ can be reused across parallel ISW-based 32-bit nonlinear operations. We thereby improve the efficiency of 2$\times b/4/32$ parallel high-order ISW masking for RECTANGLE, TANGRAM and KNOT on a 32-bit ARM embedded microprocessor, with a speed-up of roughly 13%-34%, at the cost of $(1+d) \times 32$ bits of randomness. For hardware masking, 4-bit cubic Sboxes with quadratic decomposition length 2, including those of RECTANGLE, TANGRAM, KNOT and the LWC third-round candidates, can be implemented as 3-share and 4-share threshold implementations (TI) by decomposing the cubic permutation $S$ into a composition of sub-permutations of lower algebraic degree. We use two decomposition forms: a composition of two quadratic permutations $G$ and $F$, $S = F\circ G$, for efficiency; and a composition of linear permutations $A_i$ with a single quadratic permutation $G$, $S=A_3 \circ G \circ A_2 \circ G \circ A_1$, for reducing the area requirements. For $S = F\circ G$, we introduce a new approach that searches through all possible quadratic permutations $G$ with complexity $2^{25.71}$, which is more efficient than the $2^{26.23}$ of Poschmann et al. (J. Cryptol. 2011). For $S=A_3 \circ G \circ A_2 \circ G \circ A_1$, our approach finds the $A_i$ with complexity $2^{27.71}$, which is more efficient than the method introduced by Moradi et al. at ASIACRYPT 2016. In addition, we propose a new decomposition, $S=G \circ A_2 \circ G \circ A_1$. With these methods we can find the fastest and the smallest decomposed hardware implementations of 4-bit permutations for TI with 3 and 4 shares.
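
For readers unfamiliar with the ISW scheme that the software part of the abstract builds on, the following is a minimal functional sketch of a textbook ISW-masked AND on bitsliced 32-bit words with $d+1$ Boolean shares. It is not the paper's implementation: it does not include the common shares optimisation or the reuse of cross-products $a_{i}\cdot b_{j}$, and Python offers no leakage guarantees, so it only illustrates the arithmetic. The names `share`, `unshare` and `isw_and` are hypothetical and chosen for illustration.

```python
import secrets
from functools import reduce
from operator import xor

MASK32 = 0xFFFFFFFF  # each share is one bitsliced 32-bit word


def share(x, d):
    """Split the 32-bit word x into d+1 Boolean shares whose XOR is x."""
    shares = [secrets.randbits(32) for _ in range(d)]
    # The last share is x XORed with all the random shares.
    shares.append(reduce(xor, shares, x) & MASK32)
    return shares


def unshare(shares):
    """Recombine Boolean shares by XORing them together."""
    return reduce(xor, shares, 0)


def isw_and(a, b):
    """Textbook ISW multiplication (bitwise AND) on two sharings.

    Every cross-product a[i] & b[j] is computed once; each pair (i, j)
    with i < j consumes one fresh random 32-bit word, i.e. d(d+1)/2
    random words per masked AND for d+1 shares.
    """
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(32)
            c[i] ^= r
            # Parenthesisation follows ISW: r is mixed in before the
            # second cross-product is added.
            c[j] ^= (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
    return c


# Usage: mask two words at order d = 2 and check the masked AND.
x, y = secrets.randbits(32), secrets.randbits(32)
a, b = share(x, 2), share(y, 2)
assert unshare(isw_and(a, b)) == (x & y)
```

The random words cancel pairwise when the output shares are XORed, leaving the XOR of all cross-products, which equals the AND of the two unmasked words. The number of cross-products grows quadratically with the number of shares, which is why reusing them across parallel operations, as the abstract describes, can pay off.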
Source journal: Computer Journal
Category: Engineering and Technology, Computer Science: Software Engineering
CiteScore: 3.60
Self-citation rate: 7.10%
Publication volume: 164
Review time: 4.8 months
Journal description: The Computer Journal is one of the longest-established journals serving all branches of the academic computer science community. It is currently published in four sections.