Integration-The Vlsi Journal: Latest Articles

Optimal design of mixed dielectric coaxial-annular TSV using GWO algorithm based on artificial neural network
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102205 Pub Date: 2024-05-08 DOI: 10.1016/j.vlsi.2024.102205
Liwen Zhang, He Yang, Chen Yang, Jincan Zhang, Jinchan Wang
Abstract: Single-objective, single-parameter optimization is commonly used in TSV structure optimization to improve transmission characteristics, but it makes a design that simultaneously satisfies multiple target requirements difficult to obtain. Moreover, the method cannot optimize different design parameters simultaneously. To address these problems, a global optimization method based on the grey wolf optimization (GWO) algorithm and an artificial neural network (ANN) model is proposed. For the presented mixed dielectric coaxial-annular TSV model, six key design parameters A-F are first selected as optimization variables by the control variable method. An L25(5^6) orthogonal experiment is designed for Taguchi analysis and analysis of variance (ANOVA). Then three prediction models, ANN, support vector machine (SVM), and extreme learning machine (ELM), are developed with the extended orthogonal data as the training sets; the ANN model performs best. To search for the global optimal solution, the genetic algorithm (GA) and the GWO algorithm are each combined with the ANN model. The results show that the GWO algorithm avoids falling into local optima more successfully than GA, and that its convergence is faster and more stable. After GWO-ANN optimization, each S-parameter index improves greatly: at 30 GHz, S11 is reduced by 14.05 dB, S21 is increased by 0.33 dB, and S31 is reduced by 12.50 dB.
Citations: 0
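The surrogate-plus-GWO search described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the trained ANN is replaced by a hypothetical toy objective (`toy_surrogate`, a 6-D sphere function standing in for a model of the six parameters A-F), and the bounds, pack size, and iteration count are assumptions chosen only for illustration. The update rule is the commonly described basic grey wolf optimizer.

```python
import random


def gwo_minimize(objective, bounds, n_wolves=20, n_iters=200, seed=0):
    """Minimize `objective` over box `bounds` with a basic grey wolf optimizer."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best = min(wolves, key=objective)
    best_val = objective(best)

    for it in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) guide the rest of the pack.
        leaders = [list(w) for w in sorted(wolves, key=objective)[:3]]
        a = 2.0 * (1.0 - it / n_iters)  # exploration factor, decays from 2 towards 0

        moved = []
        for wolf in wolves:
            position = []
            for d, (lo, hi) in enumerate(bounds):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a  # step coefficient in [-a, a)
                    C = 2.0 * rng.random()          # leader weight in [0, 2)
                    distance = abs(C * leader[d] - wolf[d])
                    estimate += leader[d] - A * distance
                # Average the three leader-guided estimates and clamp to the box.
                position.append(min(hi, max(lo, estimate / 3.0)))
            moved.append(position)
        wolves = moved

        # Remember the best solution seen so far across all iterations.
        candidate = min(wolves, key=objective)
        candidate_val = objective(candidate)
        if candidate_val < best_val:
            best, best_val = list(candidate), candidate_val

    return best, best_val


def toy_surrogate(x):
    """Hypothetical stand-in for a trained surrogate model: a 6-D sphere function."""
    return sum(v * v for v in x)


best, best_val = gwo_minimize(toy_surrogate, [(-5.0, 5.0)] * 6)
print(best_val)
```

In a real flow the objective would call the trained prediction model and combine the S-parameter targets into a single cost; how the paper weights those targets is not stated in the abstract.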
Design of CMOS fully differential multipath two-stage OTA with boosted slew rate and power efficiency
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102204 Pub Date: 2024-05-07 DOI: 10.1016/j.vlsi.2024.102204
Zahra Hashemi, Mostafa Yargholi
Abstract: A CMOS fully differential multipath two-stage operational transconductance amplifier (OTA) with boosted slew rate and power efficiency is proposed in this paper. The new OTA consists of two gain stages, and its basic structure is the recycling folded cascode (RFC) structure. Using the multipath technique in the first stage increases gain and decreases power consumption. In addition, a high-speed current mirror is applied to increase the phase margin. The second stage, a class-AB amplifier, is used to increase the transconductance and slew rate of the output. Moreover, the power efficiency of the proposed OTA is boosted compared to the recycling double-folded cascode (RDFC) OTA. This makes the proposed OTA more appropriate for applications that require low power consumption, such as neural amplifiers. The proposed OTA is designed and simulated in 0.18 μm standard CMOS technology with a 1 V supply voltage. Post-layout simulation results demonstrate that the OTA dissipates 180 nW of power while showing a 136.7 dB voltage gain and a 127.1 kHz unity gain frequency for a capacitive load of 30 pF. Thus, compared to the RDFC OTA, the proposed OTA provides a 250 % increase in slew rate and a 20 % increase in PSRR and CMRR, while power consumption is reduced by 10 %. The proposed OTA is robust against process, voltage, and temperature (PVT) variations, and it achieves a good figure of merit (FOM) compared with previous OTAs.
Citations: 0
Neuro-inspired hardware solutions for high-performance computing: A TiO2-based nano-synaptic device approach with backpropagation
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102206 Pub Date: 2024-05-07 DOI: 10.1016/j.vlsi.2024.102206
Yildiran Yilmaz, Fatih Gül
Abstract: Computer-based machine learning algorithms that produce impressive performance results are computationally demanding and thus subject to high energy consumption during training and testing. Therefore, compact neuro-inspired devices are required to achieve efficiency in hardware resource consumption for the smooth implementation of neural network applications that require low energy and area. In this paper, the learning characteristics and performance of a nanoscale titanium dioxide (TiO2) based synaptic device are analyzed by implementing it in a hardware-based neural network for digit classification. Our model is experimentally validated by using 32-nm CMOS technology, and the results demonstrate that the model provides high computational ability with better accuracy and efficient resource consumption at low energy and small area. The proposed model exhibits a 20% energy gain, a 16.82% accuracy improvement, and 18% less total latency compared to the state-of-the-art Ag:Si synaptic device-based neural network. Furthermore, when compared to the software-based (i.e., computer-based) implementation of neural networks, our TiO2-based model not only achieved an accuracy of 90.01% on the MNIST dataset but also did so with reduced energy consumption. Consequently, our model, characterized by a low hardware implementation cost, emerges as a promising neuro-inspired hardware solution for various neural network applications. The proposed model was evaluated on both the MNIST and Fisher’s Iris datasets. On the latter dataset, the model achieved a precision of 94.5%, a recall of 91.5%, an F1-score of 92.9%, and an accuracy of 93.04%.
Citations: 0
Exploring BTI aging effects on spatial power density and temperature profiles of VLSI chips
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102202 Pub Date: 2024-04-30 DOI: 10.1016/j.vlsi.2024.102202
Sachin Sachdeva, Jincong Lu, Hussam Amrouch, Sheldon X.-D. Tan
Abstract: The long-term reliability of a chip, encompassing factors like bias temperature instability (BTI), plays a substantial role in the chip's operational efficiency and overall lifespan. Most studies center on performance-related aspects like delay and timing impacts, and fewer examine reliability impacts on the spatial power density and thermal profiles of chips. In this study, we investigate the BTI impacts on the spatial power density and temperature profiles of VLSI chips for the first time. We assessed the BTI aging impact on the on-chip spatial power density and temperature for two widely used circuit functional blocks (a dual port RAM and a Discrete Cosine Transform (DCT) block) at T = 130 °C and T = 25 °C to account for the worst-case BTI degradation, using degradation-aware cell libraries for a 10-year aging scenario. Furthermore, we showcased the essential role of BTI aging-aware timing analysis in evaluating the impact of BTI aging on total power, on-chip spatial power density, and thermal maps. Neglecting this aspect can result in a substantial underestimation of these quantities. We developed a power map generation method from the circuit layout and power analysis from EDA tools. We demonstrate that the maximum power density reduction for the two circuits is approximately 12 % and 20 %, respectively. Furthermore, to analyze the BTI impact on spatial temperature, we built a heat transfer model using a multiphysics tool to imitate a real chip (Intel i7-8650U) and performed thermal simulations to evaluate the spatial thermal map. The resulting maximum temperature reduction for the two circuits is approximately 10 % and 12 %, respectively, which is quite significant.
Our analysis has further unveiled that, for a specific circuit, the position of maximum power density and the occurrence of a hot spot remain consistent over time, unaffected by aging. However, these positions can vary between different circuits, primarily influenced by the workload the circuit is currently handling. Furthermore, our findings demonstrate that the effects of BTI aging are significantly more pronounced when the circuit operates at higher temperatures (T = 130 °C) than at lower operating temperatures (T = 25 °C).
Citations: 0
A rail-to-rail high speed comparator with LVDS output in 0.18-μm SiGe BiCMOS Technology
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102198 Pub Date: 2024-04-27 DOI: 10.1016/j.vlsi.2024.102198
Qiyan Sun, Ruiyong Tu, Jin Xie, Yihong Gong, Sini Wu, Jinghu Li, Zhicong Luo
Abstract: Achieving low propagation delay in comparators under low input overdrive voltage is challenging. To overcome this difficulty, this paper presents a novel rail-to-rail high-speed comparator. By clamping the output node of the current summation circuit relative to a fixed level V_C, the overdrive recovery time under large signals is successfully reduced. Moreover, by cascading multiple stages of high bandwidth and low gain, the comparator gains not only higher gain but also higher bandwidth. Ultimately, the comparator's output is transmitted at high speed through an LVDS interface. This design is implemented using 0.18 μm SiGe BiCMOS technology. Simulation results show that the comparator has a static power consumption of 26.4 mW, and for 5 mV input overdrive, the average propagation delay is about 1.09 ns.
Citations: 0
A non-degenerate n-dimensional integer domain chaotic map model with application to PRNG
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102200 Pub Date: 2024-04-25 DOI: 10.1016/j.vlsi.2024.102200
Mengdi Zhao, Hongjun Liu
Abstract: To address the limitations of existing chaotic maps, we propose a non-degenerate n-dimensional (n ≥ 2) integer domain chaotic map (nD-IDCM) model that can construct any non-degenerate n-dimensional integer domain chaotic map. Moreover, we analyze its chaotic behavior through the Lyapunov exponent and find that the nD-IDCM generates chaotic sequences in the integer domain, which effectively resolves the finite precision effect that arises when existing chaotic maps are implemented on computers or digital devices. To verify the effectiveness of nD-IDCM, we present two instances that demonstrate how the positive Lyapunov exponents can be regulated by manipulating the parameter matrix. Subsequently, we scrutinize their dynamical behavior using Kolmogorov entropy, sample entropy, correlation dimension, and randomness testing via TestU01. Finally, to assess the feasibility of nD-IDCM, we devise a keyed pseudo random number generator (PRNG) based on a 3D-IDCM that can ensure superior randomness and unpredictability. Experimental results indicate that integer domain chaotic maps constructed using nD-IDCM have desirable Lyapunov exponents and exhibit ergodicity within a sufficiently large chaotic range.
Citations: 0
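The idea of iterating a chaotic map entirely in the integer domain, so that no floating-point rounding enters the state, can be illustrated with a small sketch. The abstract above does not give the nD-IDCM equations, so this example swaps in a different, well-known map: a 2-D generalized Arnold cat map over the integers modulo 2^32. The key layout, the parameter values, and the output function are hypothetical, and the generator is a toy that should not be treated as secure or as the authors' PRNG.

```python
MODULUS = 1 << 32  # state lives on the integer lattice Z_N x Z_N with N = 2**32


def cat_map_step(x, y, a, b):
    """One step of the generalized Arnold cat map [[1, a], [b, a*b + 1]] mod N."""
    return (x + a * y) % MODULUS, (b * x + (a * b + 1) * y) % MODULUS


def cat_map_inverse(x, y, a, b):
    """Inverse step; it exists because the map's matrix has determinant 1."""
    return ((a * b + 1) * x - a * y) % MODULUS, (y - b * x) % MODULUS


def keyed_stream(key, count):
    """Yield `count` 16-bit integers from a toy keyed generator (not secure)."""
    x, y, a, b = key
    for _ in range(count):
        x, y = cat_map_step(x, y, a, b)
        # Keep only the upper half of the mixed state as the output word.
        yield (x ^ y) >> 16


# Hypothetical key: initial state (x, y) followed by map parameters (a, b).
key = (0x12345678, 0x9ABCDEF1, 3, 5)
print(list(keyed_stream(key, 4)))
```

Because every operation is exact integer arithmetic, the same key reproduces the same sequence on any platform, which is the property the abstract contrasts with finite-precision implementations of real-valued maps.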
Design of novel low cost triple-node-upset self-recoverable hardened latch
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102199 Pub Date: 2024-04-24 DOI: 10.1016/j.vlsi.2024.102199
Hui Xu, Shuo Zhu, Ruijun Ma, Zhengfeng Huang, Huaguo Liang, Haojie Sun, Chaoming Liu
Abstract: CMOS devices are increasingly affected by triple-node-upset as transistor feature sizes shrink, particularly in radiation environments. To address the shortcomings of existing radiation hardened designs, including high overhead and high delay, this paper proposes a novel low cost triple-node-upset self-recoverable latch. Simulation results show that, compared with existing triple-node-upset hardened designs, the proposed latch reduces power consumption, delay, and power-delay product by 34.57 %, 6.42 %, and 34.98 %, respectively.
Citations: 0
Ensemble learning model for effective thermal simulation of multi-core CPUs
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102201 Pub Date: 2024-04-24 DOI: 10.1016/j.vlsi.2024.102201
Lin Jiang, Anthony Dowling, Yu Liu, Ming-C. Cheng
Abstract: An ensemble data-learning approach based on proper orthogonal decomposition (POD) and Galerkin projection (EnPOD-GP) is proposed for thermal simulations of multi-core CPUs to improve the training efficiency and model accuracy of a previously developed global POD-GP method (GPOD-GP). GPOD-GP generates one set of basis functions (or POD modes) to account for thermal behavior in response to variations in dynamic power maps (PMs) across the entire chip, which is computationally intensive if possible variations of all power sources are to be covered. EnPOD-GP, however, acquires multiple sets of POD modes to significantly improve training efficiency and effectiveness, and its simulation accuracy is independent of any dynamic PM. Compared to finite element simulation, both GPOD-GP and EnPOD-GP offer a computational speedup of over 3 orders of magnitude. For a processor with a small number of cores, GPOD-GP provides a more efficient approach. When high accuracy is desired and/or a processor with more cores is involved, EnPOD-GP is preferable in terms of training effort and simulation accuracy and efficiency. Additionally, the error resulting from EnPOD-GP can be precisely predicted for any random spatiotemporal power excitation.
PDF: https://www.sciencedirect.com/science/article/pii/S0167926024000658/pdfft?md5=1bfea626d6bed7a5cf9433aa649eaf0a&pid=1-s2.0-S0167926024000658-main.pdf
Citations: 0
Co-design based FPGA implementation of an efficient new speech hyperchaotic cryptosystem in the transform domain
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102197 Pub Date: 2024-04-23 DOI: 10.1016/j.vlsi.2024.102197
Mohamed Salah Azzaz, Redouane Kaibou, Bachir Madani
Abstract: In this paper, a new encryption system is designed and implemented for real-time speech transmission to reduce bandwidth requirements, increase security, and minimize residual intelligibility. To guarantee robustness and lightweight computation, the cryptosystem operates in the wavelet transform domain and uses a hyperchaotic model to generate mask and permutation keys. The cryptographic system is designed using a hardware-software (HW/SW) co-design approach, developing several IP-cores in a relatively short development time. The performance and security of the system are validated through simulation results, followed by an experimental validation in which an encrypted speech signal is transmitted between two low cost Nexys-4 DDR FPGA platforms operating in real time over both wired and wireless links. Compared to similar works, high performance is obtained in terms of bandwidth efficiency due to the use of DWT, limited use of FPGA resources, low power consumption, and a high security level with a keyspace large enough to resist brute force attacks. The designed system can serve many real-time secure integrated voice communication systems and communication purposes that demand a high level of conversation security, whether military, professional, or personal.
Citations: 0
Optimizing code allocation for hybrid on-chip memory in IoT systems
IF 1.9, Zone 3, Engineering & Technology
Integration-The Vlsi Journal, Volume 97, Article 102195 Pub Date: 2024-04-20 DOI: 10.1016/j.vlsi.2024.102195
Zhe Sun, Zimeng Zhou, Fang-Wei Fu
Abstract: With the increasing application of IoT devices, the memory subsystem, as the performance and energy bottleneck of IoT systems, has received a lot of attention. One of the keys is on-chip memory, which can bridge the performance gap between the CPU and main memory. While many off-the-shelf embedded processors use a hybrid on-chip memory architecture containing scratchpad memories (SPMs) and caches, most existing literature ignores the collaboration between caches and SPMs. This paper proposes static SPM allocation strategies for this architecture in IoT systems, which try to minimize the overall instruction memory subsystem latency and/or energy consumption. We capture the intra- and inter-task cache conflict misses via a fine-grained temporal cache behavior model. Based on this cache conflict information, we propose an integer linear programming (ILP) algorithm to generate an optimal static function level SPM allocation for system performance. Furthermore, to improve the scalability of the proposed allocation scheme for large task sets, we introduce an interference factor that quantifies the interference impact. Then, based on the interference factor, we present two approximate knapsack based heuristic algorithms that provide near optimal static allocation schemes at both function and basic block level granularities, which favors fast design space exploration. The experimental results demonstrate that the proposed solution achieves a 30.85% improvement in memory performance and up to a 31.39% reduction in energy consumption compared to the existing SPM allocation scheme at the function level. In addition, the proposed basic block level allocation algorithm performs better than our function level allocation algorithm and other basic block level allocation algorithms.
Citations: 0
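The knapsack view of SPM allocation mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the `benefit` value is a hypothetical per-block score (for example, estimated cycles saved, possibly already weighted by an interference factor whose definition the abstract does not give), and the heuristic shown is a plain greedy selection by benefit per byte. The block names and numbers are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeBlock:
    name: str       # function or basic block identifier (hypothetical)
    size: int       # code size in bytes
    benefit: float  # hypothetical score for placing this block in the SPM


def allocate_spm(blocks, capacity):
    """Greedy knapsack heuristic: pick blocks by benefit per byte until full."""
    chosen, free = [], capacity
    candidates = [b for b in blocks if b.size > 0]
    for blk in sorted(candidates, key=lambda b: b.benefit / b.size, reverse=True):
        if blk.size <= free:
            chosen.append(blk)
            free -= blk.size
    return chosen


blocks = [
    CodeBlock("isr", 256, 900.0),
    CodeBlock("fft", 768, 1500.0),
    CodeBlock("crc", 256, 400.0),
    CodeBlock("log", 512, 300.0),
]
selected = allocate_spm(blocks, capacity=1024)
print([b.name for b in selected])
```

A greedy density rule runs in O(n log n) and is a common approximation for 0/1 knapsack; an exact ILP or dynamic programming formulation trades that speed for optimality, which matches the scalability trade-off the abstract describes.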