Microprocessors and Microsystems: Latest Articles

Hardware security against IP piracy using secure fingerprint encrypted fused amino-acid biometric with facial anthropometric signature
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 112, Article 105131. Pub Date: 2025-02-01. DOI: 10.1016/j.micpro.2024.105131
Anirban Sengupta, Aditya Anshul, Ayush Kumar Singh
Abstract: In the era of the modern global design supply chain, hardware threats are on the rise. Conventional hardware security techniques may fall short by offering inferior tamper tolerance, unpersuasive digital ownership proof and weaker entropy for sturdy intellectual property (IP) piracy detection and seamless IP ownership conflict resolution. This paper presents a novel hardware security methodology based on the IP seller's amino acid biometric and facial anthropometric features to generate an encrypted fused signature using a multi-key driven non-invertible fingerprint, providing a sturdy detective countermeasure against IP piracy. The proposed approach exploits the AES framework, where the generated key-translated fingerprint minutiae points of the IP seller are used as an encryption key. The proposed methodology is highly robust against hardware threats, as it is capable of generating large covert security constraints for embedding, as digital evidence, in the IP design during high-level synthesis (HLS). Compared with existing approaches, the results of the proposed approach indicate enhanced tamper tolerance (against brute-force attack) of up to 1.15E+77, lower probability of coincidence or false positive (against ghost signature search attack) of up to 6.72E-06, and stronger entropy of up to 2.06E-138, respectively.
Citations: 0
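The tamper-tolerance figure quoted in the abstract above (up to 1.15E+77) is close in magnitude to the size of a 256-bit search space. The sketch below is only a sanity check of that magnitude. The abstract does not state the signature length, so the 256-bit value and the helper name `search_space` are assumptions made for illustration.

```python
# Assumption: a brute-force attacker must search 2**n candidates for an
# n-bit signature. The 256-bit length is inferred from the magnitude of
# the reported figure; it is not stated in the abstract.

def search_space(bits: int) -> int:
    """Worst-case number of candidate signatures for a given bit length."""
    return 2 ** bits

space = search_space(256)
print(f"{space:.4e}")  # scientific notation of the 256-bit search space
```

If the assumption holds, each additional signature bit doubles the attacker's worst-case effort, which is why larger covert constraint sets translate directly into higher tamper tolerance.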
Design and implementation of a synchronous Hardware Performance Monitor for a RISC-V space-oriented processor
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 112, Article 105132. Pub Date: 2025-02-01. DOI: 10.1016/j.micpro.2024.105132
Miguel Jiménez Arribas, Agustín Martínez Hellín, Manuel Prieto Mateo, Iván Gamino del Río, Andrea Fernández Gallego, Óscar Rodríguez Polo, Antonio da Silva, Pablo Parra, Sebastián Sánchez
Abstract: The ability to collect statistics about the execution of a program within a CPU is of the utmost importance across all fields of computing, since it allows characterizing the timing performance of a program. This capability is even more relevant in safety-critical software systems, where it is mandatory to analyze the software timing requirements to ensure the correct operation of the programs. Moreover, in order to properly evaluate and verify the extra-functional properties of these systems, besides timing performance, there are many other statistics available on a CPU, such as those associated with its resource utilization. In this paper, we showcase a Performance Measurement Unit (PMU), also known as a Hardware Performance Monitor (HPM), integrated into a RISC-V On-Board Computer (OBC) designed for space applications by our research group. The monitoring technique features a novel approach whereby the events triggered are not counted immediately but are instead propagated through the pipeline so that their annotation is synchronized with the executed instruction. Additionally, we demonstrate the use of this PMU in a process to characterize the execution model of the processor. Finally, as an example of the statistics provided by the PMU, the results obtained running the CoreMark and Dhrystone benchmarks on the RISC-V OBC are shown.
Citations: 0
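The idea of counting events in step with the executed instruction, rather than at the moment they are raised, can be illustrated with a toy model. The sketch below is not the authors' design. The `Instr` record, the event names, and the choice to discard events belonging to flushed instructions are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Instr:
    pc: int
    events: set = field(default_factory=set)  # events raised while in flight
    flushed: bool = False                      # squashed before completing

def count_immediate(trace):
    """Count each event as soon as it is raised, even on flushed instructions."""
    counts = Counter()
    for ins in trace:
        counts.update(ins.events)
    return counts

def count_synchronous(trace):
    """Carry events along with the instruction and count them only on completion."""
    counts = Counter()
    for ins in trace:
        if not ins.flushed:
            counts.update(ins.events)
            counts["instret"] += 1  # hypothetical retired-instruction counter
    return counts

trace = [
    Instr(0x100, {"icache_miss"}),
    Instr(0x104, {"branch"}),
    Instr(0x108, {"icache_miss"}, flushed=True),  # wrong-path fetch
    Instr(0x10C, {"dcache_miss"}),
]

print(count_immediate(trace)["icache_miss"], count_synchronous(trace)["icache_miss"])
```

In this toy trace the immediate scheme attributes an instruction-cache miss to an instruction that never completes, while the synchronous scheme keeps every count tied to an instruction that actually executed.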
Efficient Coarse-Grained Reconfigurable Array architecture for machine learning applications in space using DARE65T library platform
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 113, Article 105142. Pub Date: 2025-01-14. DOI: 10.1016/j.micpro.2025.105142
Luca Zulberti, Matteo Monopoli, Pietro Nannipieri, Silvia Moranti, Geert Thys, Luca Fanucci
Abstract: With the increasing use of satellites, rovers, and other space exploration devices, Artificial Intelligence (AI) is also becoming an important tool for space exploration, allowing autonomous decision-making and operations in harsh environments. As a result, there is an increasing demand for reliable and energy-efficient processing platforms in the space industry. Among all processing architectures, Coarse-Grained Reconfigurable Arrays (CGRAs) are becoming popular, particularly in data-intensive applications like machine learning, demonstrating a substantial improvement in the energy efficiency of inference operations while preserving a good degree of versatility. In high-level class space missions, the hardware platforms incorporate radiation-hardened Field Programmable Gate Arrays (FPGAs) and microcontrollers, which do not meet the performance requirements for the aforementioned AI applications. The use of CGRA architectures in space missions is still not widely studied. The main contribution of this work is a comprehensive Design Space Exploration (DSE) activity with our highly parameterized CGRA architecture, exploring the costs associated with various design parameters when targeting AI in the space domain. We evaluated performance, power consumption, and area occupation after synthesis on the radiation-hardened DARE65T standard cell library developed by imec, based on a commercial 65 nm technology process. We characterize different CGRA configurations, comparing them with state-of-the-art solutions used for the acceleration of AI algorithms. This work highlights Performance, Power, and Area (PPA) results that range from 100 MHz (up to 600 MOps), 2.43 × 10^4 μm² cell area occupation and 0.699 mW power consumption, to 625 MHz (up to 3.75 GOps), 2.43 × 10^5 μm², 46.5 mW. During DSE activity, we highlight the optimal solutions in terms of area efficiency (up to 313.1 GOps/mm²) and energy efficiency (up to 289 GOps/W ...
Citations: 0
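The two frequency and throughput pairs quoted in the abstract above can be related by a short calculation. The sketch below divides peak throughput by clock frequency to obtain operations per cycle. The helper name `ops_per_cycle` is hypothetical, and the interpretation that each frequency and throughput pair belongs to one configuration is an assumption drawn from the wording of the abstract.

```python
def ops_per_cycle(throughput_ops_per_s: float, clock_hz: float) -> float:
    """Peak operations completed per clock cycle."""
    return throughput_ops_per_s / clock_hz

# Figures taken from the abstract: 600 MOps at 100 MHz, 3.75 GOps at 625 MHz.
low_end = ops_per_cycle(600e6, 100e6)
high_end = ops_per_cycle(3.75e9, 625e6)

print(low_end, high_end)
```

Under that reading, both ends of the reported range peak at the same number of operations per cycle, so the throughput difference between them would come from clock frequency alone.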
Hardware implementation of a high-resolution auto-tuned time-frequency signal analyzer over TMS320C6713 DSK using a compact support polynomial kernel
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 113, Article 105141. Pub Date: 2025-01-09. DOI: 10.1016/j.micpro.2025.105141
Ibrahim Lantri, Mansour Abed, Adel Belouchrani
Abstract: This paper explores the hardware implementation of an embedded time-frequency signal analyzer using the Polynomial Cheriet-Belouchrani Distribution (PCBD) with a compact kernel. We implemented this distribution on a Texas Instruments TMS320C6713 Digital Signal Processing Starter Kit (DSK). Compared to other quadratic time-frequency distributions (TFDs), the PCBD has a low computational cost due to its compact support, which reduces the number of points that need to be calculated. The sole smoothing parameter γ that controls its kernel's bandwidth is an integer, simplifying the unsupervised approach. To ensure that the realized TF analyzer is automatically tuned, an accurate low-complexity performance measure must be employed to achieve optimal concentration, resolution, and cross-term suppression. Failure to do so may result in missing or degraded essential signal characteristics. The Stankovic measure has been identified as the preferred measure among many others for finding the optimal value of the integer γ. We have also explored methods to optimize the execution of various algorithms by taking advantage of specific mathematical properties inherent in the compact polynomial kernel and the PCBD. Additionally, we propose a recursive method to minimize the computation cost associated with the discrete PCB kernel. These strategies are designed to enhance efficiency and reduce the required machine cycles. To compare performance, we thoroughly evaluate the numerical complexity of our implemented distribution, both with and without mathematical optimization. The findings demonstrate the effectiveness of using the TMS320C6713 DSK board to design a high-resolution auto-tuned time-frequency signal analyzer. We not only achieved a perfect match with the results obtained from MATLAB, but the optimized approach also reduced runtime by approximately 19% to 47% compared to the direct method, depending on the input signal length and the number of loops required to optimize the Stankovic measure. A comparative analysis was also conducted to assess the effectiveness of our approach in relation to other linear and quadratic TF analyzers, including those implemented on field-programmable gate arrays (FPGAs).
Citations: 0
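The auto-tuning step described in the abstract above, picking the integer γ that optimizes a concentration measure, can be sketched generically. The code below is not the authors' implementation. The measure is written in the form commonly given for the Stankovic concentration measure, (Σ|TFD|^(1/p))^p with p > 1, where lower values mean better concentration. The normalization by total magnitude and the `compute_tfd` callable (standing in for the PCBD computation) are assumptions.

```python
from typing import Callable, Iterable, List

Matrix = List[List[float]]

def stankovic_measure(tfd: Matrix, p: float = 2.0) -> float:
    """Concentration measure; lower values indicate a more concentrated TFD.

    Assumption: the distribution is normalized by its total magnitude first.
    """
    total = sum(abs(v) for row in tfd for v in row)
    if total == 0:
        raise ValueError("distribution has no energy")
    return sum((abs(v) / total) ** (1.0 / p) for row in tfd for v in row) ** p

def auto_tune(compute_tfd: Callable[[int], Matrix], gammas: Iterable[int]) -> int:
    """Return the integer gamma whose TFD minimizes the measure.

    compute_tfd is a hypothetical stand-in for the kernel-based TFD computation.
    """
    return min(gammas, key=lambda g: stankovic_measure(compute_tfd(g)))

# Toy stand-in: gamma = 2 yields a concentrated map, gamma = 1 a spread one.
toy = {1: [[0.25, 0.25], [0.25, 0.25]], 2: [[1.0, 0.0], [0.0, 0.0]]}
print(auto_tune(lambda g: toy[g], [1, 2]))
```

Because γ is an integer, an exhaustive search over a small candidate range is enough, which is consistent with the abstract's remark that the integer parameter simplifies the unsupervised approach.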
An adaptive binary classifier for highly imbalanced datasets on the Edge
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 111, Article 105120. Pub Date: 2024-11-01. DOI: 10.1016/j.micpro.2024.105120
V. Hurbungs, T.P. Fowdur, V. Bassoo
Abstract: Edge machine learning brings intelligence to low-power devices at the periphery of a network. By running machine learning algorithms on the Edge, classification can be performed faster without the need to transmit large data volumes across a network. However, on-device training is often not feasible since Edge devices have limited computing and storage resources. Improved, Scalable, Efficient, and Fast classifieR (iSEFR) is a classifier that performs both training and testing on low-power devices using linearly separable balanced datasets. The novelty of this work is the improvement of the iSEFR accuracy by fine-tuning the algorithm with datasets having an uneven class distribution. Three adaptive linear function transformation techniques were proposed to improve the decision threshold, which is in the form of a linear function. Experiments using stratified sampling with 5-fold cross-validation demonstrate that one of the proposed techniques significantly improved F1-score, Recall and Matthews Correlation Coefficient (MCC) by an average of 23%, 35% and 21% compared to iSEFR. Further evaluation of this technique in a Fog environment using highly imbalanced datasets such as credit card fraud, network intrusion and diabetic retinopathy also showed a significant increase of 38%, 44% and 30% in F1-score, Recall and MCC with a Precision of 97%. The adaptive binary classifier maintained the time complexity of iSEFR without altering the class imbalance.
Citations: 0
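The effect of adjusting a decision threshold on imbalanced data can be shown with a small example. The sketch below is not one of the three transformation techniques from the paper, which the abstract does not describe. It simply shifts the threshold of an already-computed linear score to maximize F1 and reports MCC. All function names and the toy scores are illustrative assumptions.

```python
import math

def confusion(scores, labels, threshold):
    """Return (tp, fp, tn, fn) when predicting 1 for scores >= threshold."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred and y:
            tp += 1
        elif pred and not y:
            fp += 1
        elif not pred and not y:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def f1(tp, fp, tn, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, tn, fn):
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def best_threshold(scores, labels):
    """Pick the candidate threshold (one per distinct score) with the highest F1."""
    return max(sorted(set(scores)), key=lambda t: f1(*confusion(scores, labels, t)))

# Toy imbalanced data: eight negatives, two positives.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.9]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

t = best_threshold(scores, labels)
print(t, f1(*confusion(scores, labels, t)), mcc(*confusion(scores, labels, t)))
```

With a fixed threshold of 0.8 one of the two positives is missed, while the tuned threshold recovers it without adding false positives, which is the kind of Recall and F1 gain the abstract reports for minority classes.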
Algorithms for scheduling CNNs on multicore MCUs at the neuron and layer levels
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 111, Article 105107. Pub Date: 2024-11-01. DOI: 10.1016/j.micpro.2024.105107
Petr Dobiáš, Thomas Garbay, Bertrand Granado, Khalil Hachicha, Andrea Pinna
Abstract: Convolutional neural networks (CNNs) are progressively deployed on embedded systems, which is challenging because their computational and energy requirements need to be satisfied by devices with limited resources and power supplies. For instance, they can be implemented in the Internet of Things or edge computing, i.e., in applications using low-power and low-performance microcontroller units (MCUs). Monocore MCUs are not tailored to respond to the computational and energy requirements of CNNs due to their limited resources, but a multicore MCU can overcome these limitations. This paper presents an empirical study analysing three algorithms for scheduling CNNs on embedded systems at two different levels (neuron and layer levels) and evaluates their performance in terms of makespan and energy consumption using six neural networks, both in general and in the case of CubeSats. The results show that the SNN algorithm outperforms the other two algorithms (STD and STS) and that scheduling at the layer level significantly reduces the energy consumption. Therefore, embedded systems based on multicore MCUs are suitable for executing CNNs, and they can be used, for example, on board small satellites called CubeSats.
Citations: 0
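Makespan, the metric used in the abstract above, can be made concrete with a toy scheduler. The sketch below is not SNN, STD, or STS, which the abstract names but does not describe. It uses a generic greedy longest-processing-time heuristic to spread the neurons of each layer over the cores, with layers executed in sequence. The per-neuron costs are made-up numbers.

```python
import heapq

def lpt_makespan(task_costs, n_cores):
    """Greedy longest-processing-time-first assignment of independent tasks."""
    loads = [0.0] * n_cores
    heapq.heapify(loads)
    for cost in sorted(task_costs, reverse=True):
        least_loaded = heapq.heappop(loads)
        heapq.heappush(loads, least_loaded + cost)
    return max(loads)

def network_makespan(layers, n_cores):
    """Layers run one after another; neurons within a layer run in parallel."""
    return sum(lpt_makespan(neuron_costs, n_cores) for neuron_costs in layers)

# Hypothetical per-neuron execution costs for a three-layer network.
layers = [[4, 4, 4, 4], [3, 3, 2], [5]]

print(network_makespan(layers, 1), network_makespan(layers, 2))
```

The last layer has a single neuron and gains nothing from the second core, which hints at why the granularity of scheduling matters for both makespan and energy.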
Quality-driven design of deep neural network hardware accelerators for low power CPS and IoT applications
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 111, Article 105119. Pub Date: 2024-11-01. DOI: 10.1016/j.micpro.2024.105119
Yahya Jan, Lech Jóźwiak
Abstract: This paper presents the results of our analysis of the main problems that have to be solved in the design of highly parallel high-performance accelerators for Deep Neural Networks (DNNs) used in low power Cyber-Physical System (CPS) and Internet of Things (IoT) devices, in application areas such as smart automotive, health and smart services in social networks (Facebook, Instagram, X/Twitter, etc.). Our analysis demonstrates that to arrive at a high-quality DNN accelerator architecture, complex mutual trade-offs have to be resolved among the accelerator micro- and macro-architecture, and the corresponding memory and communication architectures, as well as among the performance, power consumption and area. Therefore, we developed a multi-processor accelerator design methodology involving an automatic design-space exploration (DSE) framework that enables a very efficient construction and analysis of DNN accelerator architectures, as well as an adequate trade-off exploitation. To satisfy the low power demands of IoT devices, we extend our quality-driven model-based multi-processor accelerator design methodology with some novel power optimization techniques at the processor and memory exploration stages. Our proposed power optimization techniques at the processor exploration stage achieve up to 66.5% reduction in power consumption, while our proposed data reuse techniques avoid up to 85.92% of redundant memory accesses, thereby reducing the power consumption of the accelerator as required for low-power IoT applications. Currently, we are beginning to apply this methodology with the proposed power optimization techniques to the design of low-power DNN accelerators for IoT applications.
Citations: 0
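The link between data reuse and avoided memory accesses can be illustrated with a simple count. The sketch below does not reproduce the paper's techniques or its 85.92% figure. It assumes a one-dimensional sliding-window convolution and compares re-reading every input for each output against reading each input once into a local buffer. The function names and sizes are illustrative.

```python
def memory_reads(n_inputs: int, kernel: int, reuse: bool) -> int:
    """Input reads from main memory for a 1-D sliding-window convolution."""
    outputs = n_inputs - kernel + 1
    if outputs <= 0:
        raise ValueError("kernel is larger than the input")
    # With reuse, each input is fetched once and kept in a local buffer.
    return n_inputs if reuse else outputs * kernel

def avoided_fraction(n_inputs: int, kernel: int) -> float:
    """Share of naive reads that become unnecessary once inputs are reused."""
    naive = memory_reads(n_inputs, kernel, reuse=False)
    reused = memory_reads(n_inputs, kernel, reuse=True)
    return 1 - reused / naive

print(memory_reads(32, 3, reuse=False), memory_reads(32, 3, reuse=True))
print(round(avoided_fraction(32, 3), 4))
```

Larger kernels overlap more between neighbouring windows, so the avoidable share grows with kernel size, and every avoided off-chip read also saves the energy of that access.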
Lower the RISC: Designing optical-probing-attack-resistant cores
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 111, Article 105121. Pub Date: 2024-11-01. DOI: 10.1016/j.micpro.2024.105121
Sajjad Parvin, Sallar Ahmadi-Pour, Chandan Kumar Jha, Frank Sill Torres, Rolf Drechsler
Abstract: Recently, a new Side-Channel Analysis (SCA)-based attack, namely the Optical Probing (OP) attack, has been shown to bypass the implemented protection mechanisms on the chip, allowing unauthorized access to confidential information such as stored security keys or Intellectual Property (IP). Several countermeasures against the OP attack exist that require changes in the chip's fabrication process, i.e., chip fabrication using OP-resistant materials, resulting in increased fabrication costs. Other countermeasures are implemented at the layout level. These countermeasures suffer from a significant drop in performance due to the utilization of custom logic cells. Additionally, available techniques against OP at the layout level require a layout design of the logic cell library from scratch, which is a time-consuming process. In this work, we mitigate these limitations and propose a methodology to design high-performance OP-attack-resistant circuits. Using a two-fold methodology, we achieve an OP-attack-resistant circuit. First, we design a high-performance Low optical Leakage-Dual Rail Logic (LoL-DRL) cell library based on a standard CMOS logic cell library. Hence, no complete redesign of the layout is required. Second, we propose a streamlined synthesis technique to synthesize OP-attack-resistant circuits from the original circuit's netlist. Thus, our method seamlessly integrates into the existing synthesis flow. On top of that, we analyzed the optical leakage information of several logic cells from both the standard logic cell library and our proposed LoL-DRL logic cell library against the OP attack. We used a metric called Optical Leakage Value (OLV) to report the robustness of a logic cell against the OP attack. Furthermore, as a case study, we applied our design methodology to an open-source RISC-V core to design the first OP-attack-resistant RISC-V core, named Lo-RISK. Our approach minimizes any adverse impact on performance yet incurs significant expenses in terms of both area and power consumption, which is acceptable for an OP-secure end product. On average, our proposed LoL-DRL logic cell library exhibits 2× less information leakage through OP compared to the standard CMOS logic cell library. Our approach to designing OP-resistant circuits results in 2× the area and a 1.36× power increase while operating at the same frequency in comparison to a circuit designed using a standard CMOS logic cell library.
Citations: 0
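The general principle behind dual-rail logic can be shown at the bit level. The sketch below illustrates only the textbook encoding, in which each bit is carried on a true rail and a complementary rail. It does not model the LoL-DRL cells, optical probing physics, or the OLV metric, none of which are specified in the abstract. The function names are illustrative.

```python
from itertools import product

def dual_rail_encode(bits):
    """Encode each bit b as the pair (b, 1 - b): a true rail and a false rail."""
    return [(b, 1 - b) for b in bits]

def active_rails_single(bits):
    """Number of asserted wires in a single-rail word (depends on the data)."""
    return sum(bits)

def active_rails_dual(bits):
    """Number of asserted wires in the dual-rail word (one per bit, always)."""
    return sum(t + f for t, f in dual_rail_encode(bits))

single = {active_rails_single(w) for w in product([0, 1], repeat=4)}
dual = {active_rails_dual(w) for w in product([0, 1], repeat=4)}
print(sorted(single), sorted(dual))
```

Across all 4-bit words the single-rail activity takes several values while the dual-rail activity is constant, which is the data-independence property that dual-rail countermeasures rely on, at the cost of roughly doubled wiring and area.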
Low-cost constant time signed digit selection for most significant bit first multiplication
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 111, Article 105118. Pub Date: 2024-11-01. DOI: 10.1016/j.micpro.2024.105118
Ghassem Jaberipur, Saeid Gorgin, Jeong-A. Lee
Abstract: Serial binary multiplication is frequently used in many digital applications. In particular, left-to-right (aka online) manipulation of operands promotes the real-time generation of product digits for immediate utilization in subsequent online computations (e.g., successive layers of a neural network). In left-to-right arithmetic operations, where a residual is maintained for digit selection, utilization of a redundant number system for the representation of outputs is mandatory, while the input operands and the residual may be redundant or non-redundant. However, when the input data paths are narrow (e.g., eight bits as in BFloat16), conventional non-redundant representations of inputs and residual provide some advantages. One example is the immediate and costless sign detection of the residual, which is necessary for the next digit selection, a property not shared by redundant numbers. Nevertheless, digit selection, as practiced in previous realizations with both redundant and non-redundant inputs and/or residual, is slow and rather complex. Therefore, in this paper, we offer an imprecise but faster digit selection scheme, with the required correction in the next cycle. Analytical evaluations and synthesis of the proposed circuits on an FPGA platform show a 30% speedup and lower cost with respect to both cases, with redundant and non-redundant inputs and residual.
Citations: 0
Retraction notice to “A Hybrid Semantic Similarity Measurement for Geospatial Entities” [Microprocessors and Microsystems 80 (2021) 103526]
IF 1.9 Zone 4 Computer Science
Microprocessors and Microsystems, Volume 111, Article 105117. Pub Date: 2024-10-20. DOI: 10.1016/j.micpro.2024.105117
Citations: 0